WO2021201773A1

WO2021201773A1 - Apparatus and method for implementing a complex-valued neural network

Info

Publication number: WO2021201773A1
Application number: PCT/SG2021/050173
Authority: WO
Inventors: Ai Qun Liu; Hui Zhang; Leong Chuan Kwek; Xudong Jiang
Original assignee: Nanyang Technological University
Priority date: 2020-04-03
Filing date: 2021-03-29
Publication date: 2021-10-07

Abstract

Embodiments of the invention provide an apparatus and method for implementing a complex-valued neural network. The apparatus comprises a plurality of Mach-Zehnder interferometers (MZIs) integrated onto a chip, wherein the plurality of MZIs are arranged to form a signal preparation unit configured to divide an input light from a laser source into a plurality of light beams, and modulate an amplitude and/or a phase of each light beam to generate a plurality of input signals and a reference signal; a weighting unit configured to adjust a configuration of the MZIs which are used to form the weighting unit according to a weight value or a weight matrix to transform the generated input signals into at least one output signal; and a coherent detection unit configured to interfere each of the at least one output signal with the reference signal to obtain information encoded in the magnitude and/or the phase of the at least one output signal.

Description

APPARATUS AND METHOD FOR IMPLEMENTING A COMPLEX-VALUED NEURAL NETWORK

Cross-Reference To Related Application

[0001] This application claims the benefit of priority of Singapore patent application No. 10202003128Y, filed 3 April 2020, the content of which is hereby incorporated by reference in its entirety for all purposes.

Technical Field

[0002] The invention relates to optical neural networks, and more specifically to an apparatus and a method suitable for implementing a complex-valued neural network. The apparatus includes an optical neural chip (ONC).

Background

[0003] Advanced machine learning algorithms, such as artificial neural networks, have received significant attention for their potential applications in key tasks such as image recognition and language processing. Notably, neural networks make heavy use of multiply-accumulate (MAC) operations, which cannot be efficiently executed in existing electronic computing hardware, e.g., central processing unit (CPU), graphics processing unit (GPU), field programmable gate array (FPGA), and so on. Application-specific devices for executing MAC operations are preferred. Moreover, the vast majority of existing neural networks currently rely entirely on real-valued arithmetic. Nevertheless, recent studies suggest that complex-valued arithmetic would significantly improve the performance of neural networks by offering rich representational capacity, fast convergence, strong generalization, and noise-robust memory mechanisms. Unfortunately, conventional electronic computing platforms are incapable of executing algorithms using truly complex-valued operations because each complex number must be represented by a pair of real numbers, which compounds the inefficiency of the multiply-accumulate operation, the most computationally expensive component of neural network algorithms. To overcome these hurdles, it has been proposed that the computationally taxing task of implementing neural networks be outsourced to optical computing, which is capable of truly complex-valued arithmetic.

[0004] Optical computing offers low power consumption, high computational speed, large information storage, inherent parallelism, and other advantages unparalleled by its electronic counterpart. Several optical implementations of neural networks have been proposed. Among these technologies, photonic chip-based optical neural networks have become increasingly mainstream for their high compatibility, scalability, and stability. This platform has already had notable successes in demonstrating neuromorphic photonic weight banks, all-optical neural networks, and optical reservoir computing. A classical fully connected neural network has been experimentally demonstrated on an integrated silicon photonic chip. Although these optical chips are based on light interference, the neural network algorithms are real-valued, which forfeits the potential of complex-valued neural networks.

[0005] The unique advantage of optical neural networks is their ability to process information in multiple degrees of freedom, e.g., magnitude and phase, using complex-valued arithmetic, leading to more efficient information processing and analysis. Nevertheless, existing optical neural networks have not tapped this potential due to their reliance on classical deep learning algorithms designed for real-valued arithmetic on conventional electronic computers and the practical limitations of spatial optics where maintaining phase stability comes at a significant premium. These real-valued optical neural networks are implemented solely using the intensity information of the optical signals while discarding the phase information, which forfeits a key benefit of optical computing.

[0006] It, therefore, would be desirable to provide an optical neural network suitable for executing complex-valued arithmetic so as to highlight the unique advantages of optical neural networks mentioned above.

Summary of Invention

[0007] Various embodiments of the invention provide an apparatus and a method for implementing a complex-valued neural network to realize an on-chip optical complex-valued neural network. The apparatus includes an ONC.

[0008] According to a first aspect of the invention, some embodiments of the invention provide an apparatus for implementing a complex-valued neural network. The apparatus may include a plurality of Mach-Zehnder interferometers (MZIs) integrated onto a chip, wherein the plurality of MZIs are arranged to form a signal preparation unit configured to divide an input light from a laser source into a plurality of light beams, and modulate an amplitude and/or a phase of each light beam to generate a plurality of input signals and a reference signal; a weighting unit configured to adjust a configuration of the MZIs which are used to form the weighting unit according to a weight value or a weight matrix to transform the generated input signals into at least one output signal, and a coherent detection unit configured to interfere each of the at least one output signal with the reference signal to obtain information encoded in the magnitude and/or the phase of the at least one output signal.

[0009] According to the first aspect of the invention, in some embodiments of the invention, the signal preparation unit may be further configured to modulate both the amplitude and the phase of each light beam to generate a plurality of complex-valued input signals.

[0010] According to a second aspect of the invention, some embodiments of the invention provide a method for implementing a complex-valued neural network. The method may include: providing an apparatus for implementing the complex-valued neural network, wherein the apparatus comprises a plurality of MZIs integrated onto a chip, wherein the plurality of MZIs are arranged to form a signal preparation unit, a reference signal generation unit, a weighting unit and a coherent detection unit, wherein the method further comprises: dividing, by the signal preparation unit, an input light from a laser source into a plurality of light beams, and modulating an amplitude and/or a phase of each light beam to generate a plurality of input signals and a reference signal; adjusting, by the weighting unit, a configuration of the MZIs which are used to form the weighting unit according to a weight value or a weight matrix to transform the input signals into at least one output signal, and interfering, by the coherent detection unit, each of the at least one output signal with the reference signal to obtain information encoded in the magnitude and/or the phase of the at least one output signal.

[0011] According to the second aspect of the invention, in some embodiments of the invention, the step of modulating an amplitude and/or a phase of each light beam to generate a plurality of input signals may include: modulating both the amplitude and the phase of each light beam to generate a plurality of complex-valued input signals.

[0012] With the apparatus and method for implementing a complex-valued neural network proposed in various embodiments of the invention, an on-chip optical complex-valued neural network that integrates a signal preparation unit, a weighting unit, and a coherent detection unit within a single photonic chip is realized to highlight the unique advantages of optical neural networks and further provide a benchmark for chip-based complex-valued networks by optical computing. Meanwhile, previous complications of complex-valued networks, e.g., cumbersome arithmetic on complex numbers, are alleviated by realizing such operations directly and passively through optical interference with no computational overhead.

Brief Description of the Drawings

[0013] In the drawings, like reference characters generally refer to like parts throughout the different views. The drawings are not necessarily to scale, emphasis instead generally being placed upon illustrating the principles of the invention. In the following description, various embodiments of the invention are described with reference to the following drawings, in which:

[0014] Fig. 1 shows a schematic view of an apparatus for implementing a complex-valued neural network according to some embodiments of the invention.

[0015] Fig. 2 shows a flowchart illustrating a method for implementing a complex-valued neural network according to some embodiments of the invention.

[0016] Fig. 3A shows a schematic diagram illustrating a scaling architecture of an optical neural network according to some embodiments of the invention.

[0017] Fig. 3B shows a schematic diagram illustrating a structure of an ONC for a complex-valued neural network according to one embodiment of the invention.

[0018] Fig. 4A shows a packaged ONC for a complex-valued neural network according to one embodiment of the invention.

[0019] Fig. 4B shows a false-color micrograph of an MZI network with integrated heaters on the ONC shown in Fig. 4A.

[0020] Fig. 4C shows a false-color micrograph of a waveguide-coupled Ge-on-SOI photodetector on the ONC shown in Fig. 4A.

[0021] Fig. 5A is a schematic diagram illustrating a computation process of a complex-valued neuron according to a first embodiment of the invention.

[0022] Fig. 5B is a schematic diagram illustrating the on-chip implementation of complex-valued neurons according to the first embodiment of the invention.

[0023] Figs. 5C and 5D are plots showing the training process of the NAND and XOR gates respectively according to the first embodiment.

[0024] Fig. 6A shows the representative images from each class, demonstrating substantial overlap in features between the three subspecies according to a second embodiment of the invention.

[0025] Fig. 6B shows validation results of the Iris flower classification task when only phase is used to train the complex-valued neural network according to the second embodiment of the invention.

[0026] Fig. 6C shows validation results of the Iris flower classification task when only magnitude is used to train the complex-valued neural network according to the second embodiment of the invention.

[0027] Fig. 6D shows validation results of the Iris flower classification task when both magnitude and phase are used to train the complex-valued neural network according to the second embodiment of the invention.

[0028] Fig. 7A is a schematic diagram illustrating an optical neural network according to a third embodiment of the invention.

[0029] Fig. 7B is a plot showing a comparison of performances of a complex-valued network and a real-valued neural network implemented on the same chip according to the third embodiment of the invention.

[0030] Fig. 7C is a plot showing the performances of a complex-valued neural network implemented on an ONC in five different scenarios according to the third embodiment of the invention.

[0031] Fig. 8 is a schematic diagram showing a cross-section of a designed waveguide according to one embodiment of the invention.

[0032] Fig. 9A is a plot showing I-V characteristics of the chip-integrated heater as shown in Fig. 8.

[0033] Fig. 9B is a plot showing polynomial fittings of a relationship between electrical power and applied current on the non-resistive heater as shown in Fig. 8.

[0034] Fig. 10 shows plots showing the comparison between phase shifters with and without isolation trench according to the embodiment shown in Fig. 8.

[0035] Fig. 11A shows a schematic diagram illustrating the main components in the experimental setup according to one embodiment of the invention.

[0036] Fig. 11B is a raw picture showing the central part of the ONC with deep isolation trenches according to the embodiment in Fig. 11A.

[0037] Fig. 11C is a photograph showing the chip testing bed in an experiment according to the embodiment shown in Fig. 11A.

[0038] Figs. 11D and 11E show SEM pictures of a grating coupler and a multimode interferometer (MMI) respectively according to the embodiment shown in Fig. 11A.

[0039] Fig. 12 shows plots illustrating fitting curves of the calibration of several exemplary phase shifters according to some embodiments of the invention.

[0040] Fig. 13A is a schematic diagram illustrating the arrangement of MZIs in an ONC according to one embodiment of the invention.

[0041] Fig. 13B shows measurement results of the exemplary binary input vector according to the embodiment shown in Fig. 13A.

[0042] Fig. 14 shows a decomposition process of a unitary matrix U according to one embodiment of the invention.

[0043] Fig. 15A and Fig. 15B are plots illustrating chip outputs by intensity measurements according to one embodiment of the invention, and Fig. 15C and Fig. 15D show the decision surfaces from the second-layer outputs.

[0044] Fig. 16A shows a schematic diagram illustrating the principle of phase-diversity homodyne detection according to some embodiments of the invention.

[0045] Fig. 16B shows a schematic diagram illustrating the process of an on-chip homodyning using an MZI and the balanced detector according to some embodiments of the invention.

[0046] Fig. 16C shows plots of exemplary results of the coherent detection by varying the signal phase according to some embodiments of the invention.

[0047] Fig. 17A is a schematic diagram illustrating a general structure of a multi-layered neural network according to some embodiments of the invention.

[0048] Figs. 17B and 17C are schematic diagrams illustrating multi-layered neural networks realized in a feedforward manner and a recurrent manner, respectively.

[0049] Figs. 18A and 18B are plots showing the training process of the NAND and XOR gates respectively according to the first embodiment.

[0050] Figs. 19A-19D are plots showing loss convergence of a single complex-neuron on logic gate tasks according to the first embodiment.

[0051] Fig. 20A is a diagram showing a decision surface formed by a trained complex-valued neuron according to the first embodiment; Fig. 20B is a diagram showing a decision surface formed by a real-valued neuron for comparison.

[0052] Fig. 21 is a plot showing the training curves of an Iris classification task for the proposed complex-valued model and for the real-valued model according to the second embodiment of the invention.

[0053] Fig. 22 shows diagrams of the decision boundaries of a real-valued layer for the Iris classification according to the second embodiment of the invention.

[0054] Fig. 23 shows diagrams of the decision boundaries of a complex-valued layer with intensity detection for the Iris classification according to the second embodiment of the invention.

[0055] Figs. 24A and 24B are diagrams showing the decision boundaries of a complex-valued layer for two datasets, circles and half-moons, respectively, according to some embodiments of the invention.

[0056] Fig. 25 is a diagram showing a confusion matrix of complex-valued MLP with a real interface according to the second embodiment of the invention.

Detailed Description

[0057] The following detailed description refers to the accompanying drawings that show, by way of illustration, specific details, and embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. Other embodiments may be utilized, and structural, logical, and electrical changes may be made without departing from the scope of the invention. The various embodiments are not necessarily mutually exclusive, as some embodiments can be combined with one or more other embodiments to form new embodiments.

[0058] Embodiments described in the context of one of the methods or apparatuses are analogously valid for the other methods or apparatuses. Similarly, embodiments described in the context of a method are analogously valid for an apparatus, and vice versa.

[0059] Features that are described in the context of an embodiment may correspondingly be applicable to the same or similar features in the other embodiments. Features that are described in the context of an embodiment may correspondingly be applicable to the other embodiments, even if not explicitly described in these other embodiments. Furthermore, additions and/or combinations and/or alternatives as described for a feature in the context of an embodiment may correspondingly be applicable to the same or similar feature in the other embodiments.

[0060] In the context of various embodiments, the articles “a”, “an” and “the” as used with regard to a feature or element include a reference to one or more of the features or elements.

[0061] In the context of various embodiments, the term “and/or” includes any and all combinations of one or more of the associated listed items.

[0062] In the context of various embodiments, the terms “first,” and “second,” etc. are used merely as labels and are not intended to impose numerical requirements on their objects.

[0063] In the context of various embodiments, the term “configured to” is interchangeable with “operative” or “adapted to”.

[0064] In the context of various embodiments, the term “about” or “approximately” as applied to a numeric value encompasses the exact value and a reasonable variance.

[0065] Fig. 1 shows a schematic view of an apparatus 100 for implementing a complex-valued neural network according to some embodiments of the invention. The apparatus 100 may include a plurality of Mach-Zehnder interferometers (MZIs) integrated onto a chip, e.g., a photonic chip, wherein the plurality of MZIs are arranged to form a signal preparation unit 110, a weighting unit 120, and a coherent detection unit 130. The signal preparation unit 110 may be configured to divide an input light from a laser source 101 into a plurality of light beams, and modulate amplitude and/or phase of each light beam to generate a plurality of input signals and a reference signal. The weighting unit 120 may be configured to adjust a configuration of the MZIs which are used to form the weighting unit according to a weight value or a weight matrix to transform the generated input signals into at least one output signal. The coherent detection unit 130 may be configured to interfere each of the at least one output signal with the reference signal to obtain information encoded in the magnitude and/or phase of the at least one output signal.

[0066] In some embodiments, when the apparatus 100 is used for implementing a complex-valued neural network, the signal preparation unit 110 may be further configured to modulate both the amplitude and the phase of each light beam to generate a plurality of complex-valued input signals. In these embodiments, the reference signal may also be complex-valued.

[0067] In some embodiments, when the apparatus 100 is used for implementing a real-valued neural network, the signal preparation unit 110 may be further configured to modulate the amplitude of each light beam and set a phase difference between two adjacent light beams to zero to generate a plurality of real-valued input signals. In these embodiments, the reference signal may also be real-valued.
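By way of non-limiting illustration, the distinction between complex-valued and real-valued input signals may be sketched in Python by treating each light beam as a complex field amplitude. The function name `encode_inputs` and the sample values are assumptions introduced only for this sketch and do not form part of the described apparatus.

```python
import cmath

def encode_inputs(amplitudes, phases=None):
    """Return one complex field amplitude per light beam.

    amplitudes: non-negative magnitudes set by amplitude modulation.
    phases: phases in radians. None means every beam has zero relative
            phase, which corresponds to the real-valued case.
    """
    if phases is None:
        phases = [0.0] * len(amplitudes)
    if len(phases) != len(amplitudes):
        raise ValueError("amplitudes and phases must have the same length")
    # cmath.rect(r, phi) builds r * exp(1j * phi).
    return [cmath.rect(a, p) for a, p in zip(amplitudes, phases)]

# Complex-valued inputs: both amplitude and phase carry information.
complex_signals = encode_inputs([1.0, 0.5], [0.0, cmath.pi / 2])

# Real-valued inputs: zero phase difference between the beams.
real_signals = encode_inputs([1.0, 0.5])
```

In this sketch, the second complex-valued signal has magnitude 0.5 and phase π/2, whereas both real-valued signals lie on the real axis.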

[0068] In some embodiments, the weighting unit 120 may be further configured to adjust the configuration of the MZIs which are used to form the weighting unit based on a feedback signal determined by a processor 102 based on the information obtained by the coherent detection unit 130 from the at least one output signal, specifically, based on the information encoded in the magnitude and/or the phase of the at least one output signal from the coherent detection unit 130.

[0069] In some embodiments, each MZI of the signal preparation unit 110, the weighting unit 120, and the coherent detection unit 130 may include a first beam splitter (BS)-phase shifter (PS) pair and a second BS-PS pair. The first BS-PS pair may include a first BS and a first tunable PS, and the second BS-PS pair may include a second BS and a second tunable PS. The first BS may be configured to divide a received light beam into a first sub-beam and a second sub-beam which may have a fixed phase difference, e.g. π/2, and the first tunable PS may be configured to add a first phase change to a path of the first sub-beam. The second BS may be configured to combine two received light beams, and the second tunable PS may be configured to add a second phase change to a path of the combined light beam from the second BS.

[0070] For example, in the signal preparation unit 110, each MZI may be formed by the first BS-PS pair followed by the second BS-PS pair. That is to say, each MZI of the signal preparation unit 110 may first divide, by the first BS, a received light beam into a first sub-beam and a second sub-beam with a phase difference of π/2, then add a first phase change to a path of the first sub-beam by the first tunable PS, next combine the first sub-beam with a changed phase and the second sub-beam by the second BS to generate a combined light beam, and add, by the second tunable PS, a second phase change to a path of the combined light beam from the second BS.
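By way of non-limiting illustration, the sequence of the first BS, the first tunable PS, the second BS and the second tunable PS may be modelled as a product of 2x2 matrices acting on the two optical paths. The sketch below assumes ideal lossless 50:50 beam splitters and assumes that each phase shifter acts on the upper path; the names `beam_splitter`, `phase_shifter`, `mzi` and `apply` are hypothetical.

```python
import cmath
import math

def matmul2(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def beam_splitter():
    """Ideal 50:50 splitter; the factor 1j gives the pi/2 phase difference."""
    s = 1 / math.sqrt(2)
    return [[s, 1j * s], [1j * s, s]]

def phase_shifter(angle):
    """Tunable phase shifter acting on the upper path only (assumption)."""
    return [[cmath.exp(1j * angle), 0], [0, 1]]

def mzi(theta, phi):
    """First BS, first PS (theta), second BS, second PS (phi)."""
    m = beam_splitter()                    # first BS acts on the input first
    m = matmul2(phase_shifter(theta), m)   # then the first tunable PS
    m = matmul2(beam_splitter(), m)        # then the second BS
    return matmul2(phase_shifter(phi), m)  # finally the second tunable PS

def apply(m, vec):
    """Apply a 2x2 matrix to a two-element vector of field amplitudes."""
    return [m[0][0] * vec[0] + m[0][1] * vec[1],
            m[1][0] * vec[0] + m[1][1] * vec[1]]

# Light enters the upper port only; theta sets the power split.
top, bottom = apply(mzi(math.pi / 2, 0.0), [1, 0])
```

Under these assumptions the upper output carries a power fraction of sin²(θ/2) and the lower output cos²(θ/2), so the first phase change controls the amplitude while the second phase change sets the phase of the upper output.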

[0071] In the weighting unit 120, an MZI may be formed by the second BS-PS pair followed by the first BS-PS pair, or vice versa. When the MZI is formed by the second BS-PS pair followed by the first BS-PS pair, the MZI of the weighting unit 120 may combine two received light beams into a combined light beam first by the second BS, then add a second phase change to a path of the combined light beam by the second tunable PS, next divide the combined light beam from the second tunable PS into a first sub-beam and a second sub-beam with a phase difference of π/2, then add a first phase change to a path of the first sub-beam by the first tunable PS.

[0072] In some embodiments, each of the first BS and the second BS may have a transmissivity of 50:50. Preferably, each of the first BS and the second BS is a multimode interferometer (MMI).

[0073] To tune a phase of a light beam by the first tunable PS or the second tunable PS, the apparatus 100 may further include a plurality of components which are integrated onto the chip, wherein each of the plurality of components is coupled to the first tunable PS or the second tunable PS of an MZI on the chip to tune a phase of a light beam by the first tunable PS or the second tunable PS coupled thereto.

[0074] In some embodiments, the plurality of components may include a plurality of titanium nitride (TiN) heaters. The number of the TiN heaters may be twice the number of the MZIs in the apparatus 100, i.e., each tunable PS in the apparatus 100 has a corresponding TiN heater provided to realize the phase change of a light beam. Preferably, to improve the heating efficiency and reduce crosstalk between adjacent heaters, a plurality of isolation trenches may be formed between two adjacent TiN heaters on the chip.

[0075] Alternatively, in some embodiments, the plurality of components may include a plurality of PIN modulators. The PIN modulators may include carrier injection modulators and/or carrier depletion modulators, which can introduce an electro-optical effect.

[0076] In some embodiments, the weighting unit may be further configured to adjust the configuration of the MZIs which are used to form the weighting unit by adjusting the first tunable PS and/or the second tunable PS of each of the MZIs which are used to form the weighting unit, e.g., adjusting the first phase change added by the first tunable PS and/or the second phase change added by the second tunable PS.

[0077] In some embodiments, the apparatus 100 may further include at least one balanced detector integrated onto the chip, wherein each of the at least one balanced detector is configured to detect an intensity of an interfering light signal generated by the coherent detection unit 130. The interfering light signal refers to the signal generated by interfering one output signal from the weighting unit 120 with the reference signal.
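By way of non-limiting illustration, the recovery of magnitude and phase by interfering an output signal with the reference signal may be sketched as follows. The sketch assumes an ideal 50:50 beam splitter, a noiseless balanced detector and two measurements whose reference phases differ by π/2; the names `balanced_difference` and `recover_signal` are hypothetical.

```python
import cmath
import math

def balanced_difference(signal, reference):
    """Intensity difference at the two outputs of an ideal 50:50 splitter."""
    s = 1 / math.sqrt(2)
    out_a = s * (signal + 1j * reference)
    out_b = s * (1j * signal + reference)
    return abs(out_a) ** 2 - abs(out_b) ** 2

def recover_signal(signal, reference):
    """Simulate two balanced measurements and estimate the complex signal.

    The second measurement shifts the reference phase by pi/2.
    """
    d_quad = balanced_difference(signal, reference)       # 2*Im(s*conj(r))
    d_inph = balanced_difference(signal, 1j * reference)  # -2*Re(s*conj(r))
    product = complex(-d_inph / 2, d_quad / 2)            # s*conj(r)
    return product / reference.conjugate()

unknown = cmath.rect(0.7, 1.1)  # magnitude 0.7, phase 1.1 rad
estimate = recover_signal(unknown, reference=1.0 + 0j)
```

In this idealized model the two difference currents give the imaginary and real parts of the product of the signal and the conjugate of the reference, from which both the magnitude and the phase of the signal follow once the reference is known.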

[0078] In a first embodiment, when the apparatus 100 is used to perform a logic gate with a single complex-valued neuron, the signal preparation unit 110 may include three MZIs configured to divide the input light into four light beams and modulate the magnitude of each light beam to generate three input signals and the reference signal, wherein the three input signals include two logic inputs and one bias. The logic gate may be an AND, OR, NAND, or XOR gate.
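By way of non-limiting illustration, a single neuron with complex-valued weights and an intensity readout may reproduce an XOR gate. The weights, the threshold and the intensity-threshold readout in the sketch below are hand-picked assumptions for illustration only; they are not trained values or measurement results from the embodiments.

```python
def complex_neuron(x1, x2, w1, w2, bias_weight, threshold=0.5):
    """Weighted sum of two logic inputs plus a bias, read out as intensity."""
    z = w1 * x1 + w2 * x2 + bias_weight * 1.0  # bias input fixed at 1
    return 1 if abs(z) ** 2 > threshold else 0

# Hypothetical weights: equal magnitude, phases 0 and pi, zero bias weight.
W1, W2, B = 1 + 0j, -1 + 0j, 0j

xor_table = {(a, b): complex_neuron(a, b, W1, W2, B)
             for a in (0, 1) for b in (0, 1)}
```

With these assumed weights the two inputs interfere destructively when both are 1, so the detected intensity is low for the inputs (0, 0) and (1, 1) and high for (0, 1) and (1, 0).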

[0079] In a second embodiment, when the apparatus 100 is used to perform a classification of the Iris dataset, the signal preparation unit 110 may include five MZIs configured to divide the input light into six light beams and modulate the magnitude and/or the phase of each light beam to generate five input signals and the reference signal, wherein the five input signals comprise four inputs and one bias. Further, the weighting unit 120 may include 12 MZIs which are adjusted according to a 4 by 3 weight matrix to transform the generated input signals into three output signals. In this embodiment, the apparatus 100 may include a single layer complex-valued neural network which may be trained using any one of the following parameters: three phase parameters {φ1, φ2, φ3}, or three magnitude parameters {ρ1, ρ2, ρ3}, or a combination of phase and magnitude parameters {ρ1, (ρ2, φ1), (ρ2, φ2)}.

[0080] In a third embodiment, when the apparatus 100 is used to perform a handwriting recognition task, the apparatus 100 may include a multi-layer perceptron formed by a plurality of single-layered complex-valued networks arranged in a cascading way or a recurrent way.

[0081] Various embodiments of the invention also provide a method for implementing a complex-valued neural network. Fig. 2 shows a flowchart illustrating a method 200 for implementing a complex-valued neural network according to some embodiments of the invention.

[0082] At Block 201, an apparatus for implementing the complex-valued neural network, e.g. the apparatus 100 shown in Fig. 1, is provided, wherein the apparatus includes a plurality of MZIs integrated onto a chip, wherein the plurality of MZIs are arranged to form a signal preparation unit, a weighting unit, and a coherent detection unit.

[0083] At Block 202, an input light from a laser source, e.g. a coherent laser, is divided into a plurality of light beams, and an amplitude and/or a phase of each light beam is modulated to generate a plurality of input signals and a reference signal by the signal preparation unit of the apparatus.

[0084] At Block 203, a configuration of the MZIs which are used to form the weighting unit is adjusted by the weighting unit of the apparatus according to a weight value or a weight matrix to transform the input signals into at least one output signal.

[0085] At Block 204, each of the at least one output signal from the weighting unit is interfered with the reference signal by the coherent detection unit of the apparatus to obtain information encoded in the magnitude and/or the phase of the at least one output signal.

[0086] It should be noted that the apparatus and method proposed in various embodiments of the invention may be used for implementing both the complex-valued neural network and the real-valued neural network.

[0087] In some embodiments, when the apparatus is used for implementing a complex-valued neural network, both the amplitude and the phase of each light beam may be modulated by the signal preparation unit to generate a plurality of complex-valued input signals. In these embodiments, the reference signal may also be complex-valued.

[0088] In some embodiments, when the apparatus is used for implementing a real-valued neural network, the amplitude of each light beam may be modulated by the signal preparation unit and a phase difference between two adjacent light beams may be set to zero by the signal preparation unit to generate a plurality of real-valued input signals. In these embodiments, the reference signal may also be real-valued.

[0089] In some embodiments, the method may further include that the configuration of the MZIs which are used to form the weighting unit is further adjusted based on a feedback signal determined by a processor. The feedback signal may be determined by the processor based on the information obtained by the coherent detection unit from the at least one output signal, i.e., the information encoded in the magnitude and/or the phase of each output signal from the coherent detection unit.
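By way of non-limiting illustration, the feedback-based adjustment of a phase shifter may be sketched as a simple optimization loop. The sketch assumes a single ideal MZI whose detected output power fraction is sin²(θ/2), and uses finite-difference gradient descent as one possible update rule; the names `top_port_power` and `tune_phase`, the learning rate and the step count are hypothetical and are not taken from the embodiments.

```python
import math

def top_port_power(theta):
    """Power fraction at one MZI output for internal phase theta (ideal model)."""
    return math.sin(theta / 2) ** 2

def tune_phase(target_power, theta=1.0, lr=1.0, steps=200, eps=1e-6):
    """Adjust theta by finite-difference gradient descent on the squared error."""
    def loss(t):
        return (top_port_power(t) - target_power) ** 2

    for _ in range(steps):
        # The feedback signal is derived from the detected output only.
        grad = (loss(theta + eps) - loss(theta - eps)) / (2 * eps)
        theta -= lr * grad
    return theta

theta_fit = tune_phase(target_power=0.25)
```

In this sketch the loop plays the role of the processor: it compares the detected output with a target value and updates the phase change accordingly, without requiring direct access to the optical field inside the chip.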

[0090] In some embodiments, each MZI of the apparatus may include a first BS-PS pair and a second BS-PS pair. The first BS-PS pair may include a first BS and a first tunable PS, and the second BS-PS pair may include a second BS and a second tunable PS. The first BS may be configured to divide a received light beam into a first sub-beam and a second sub-beam which may have a fixed phase difference, e.g., π/2, and the first tunable PS may be configured to add a first phase change to a path of the first sub-beam. The second BS may be configured to combine two received light beams, and the second tunable PS may be configured to add a second phase change to a path of the combined light beam from the second BS. Specifically, when a light beam is received by the first BS-PS pair, the light beam may first be divided into a first sub-beam and a second sub-beam by the first BS, wherein the first sub-beam and the second sub-beam may have a fixed phase difference, e.g., π/2, and then a first phase change may be added to a path of the first sub-beam by the first tunable PS. When two light beams are received by the second BS-PS pair, the two light beams may first be combined by the second BS, and then a second phase change may be added to a path of the combined light beam by the second tunable PS.

[0091] In the signal preparation unit, each MZI may be formed by the first BS-PS pair followed by the second BS-PS pair. The process of generating the plurality of the input signals and the reference signal may include: dividing, by the first BS of each MZI, a received light beam into a first sub-beam and a second sub-beam with a fixed phase difference, e.g., π/2, then adding a first phase change to a path of the first sub-beam by the first tunable PS, next combining the first sub-beam with changed phase and the second sub-beam by the second BS to generate a combined light beam, and then adding, by the second tunable PS, a second phase change to a path of the combined light beam from the second BS. The plurality of the MZIs in the signal preparation unit may be arranged to perform the steps mentioned above one by one to generate the plurality of input signals and the reference signal.

[0092] In the weighting unit, an MZI may be formed by the second BS-PS pair followed by the first BS-PS pair, or vice versa. When the MZI is formed by the second BS-PS pair followed by the first BS-PS pair, the MZI of the weighting unit may perform the following steps: first combining two received light beams into a combined light beam by the second BS, then adding a second phase change to a path of the combined light beam by the second tunable PS, next dividing the combined light beam from the second tunable PS into a first sub-beam and a second sub-beam with a fixed phase difference, e.g., π/2, and then adding a first phase change to a path of the first sub-beam by the first tunable PS.

[0093] In some embodiments, each of the first BS and the second BS may have a transmissivity of 50:50. Preferably, each of the first BS and the second BS is a multimode interferometer (MMI).

[0094] Accordingly, the step of adjusting the configuration of the MZIs which are used to form the weighting unit may further include: adjusting the first tunable PS and the second tunable PS of each of the MZIs which are used to form the weighting unit. Specifically, the step of adjusting the configuration of the MZIs may include adjusting the first phase change added by the first tunable PS and/or the second phase change added by the second tunable PS.

[0095] To tune a phase of a light beam by the first tunable PS or the second tunable PS, the step of providing the apparatus may further include providing the apparatus which further includes a plurality of components integrated onto the chip, wherein each of the plurality of components is coupled to the first tunable PS or the second tunable PS of an MZI on the chip to tune a phase of a light beam by the first tunable PS or the second tunable PS coupled thereto. In some embodiments, the plurality of components may include a plurality of TiN heaters. Preferably, to improve the heating efficiency and reduce crosstalk between adjacent heaters, a plurality of isolation trenches may be formed between two adjacent TiN heaters on the chip.

[0096] Alternatively, in some embodiments, the plurality of components may include a plurality of PIN modulators. The PIN modulators may include carrier injection modulators and/or carrier depletion modulators, which can introduce an electro-optical effect.

[0097] In some embodiments, the step of providing the apparatus may further include providing the apparatus which further includes at least one balanced detector integrated onto the chip, wherein the method may further include: detecting, by each of the at least one balanced detector, an intensity of an interfering light signal generated by the coherent detection unit.

[0098] In the first embodiment, when the apparatus is used to perform a logic gate with a single complex-valued neuron, the signal preparation unit may include three MZIs, the step of dividing the input light from a laser source may include: dividing, by the three MZIs of the signal preparation unit, the input light into four light beams, and modulating the magnitude of each light beam to generate three input signals and the reference signal, wherein the three input signals comprise two logic inputs and one bias. In this embodiment, the three MZIs are arranged to perform the signal generation process one by one. The logic gate may be an AND, OR, NAND, or XOR gate.

[0099] In the second embodiment, when the apparatus is used to perform a classification of the Iris dataset, the signal preparation unit may include five MZIs. The step of dividing the input light from a laser source may include: dividing, by the five MZIs of the signal preparation unit, the input light into six light beams, and modulating the magnitude and phase of each light beam to generate five input signals and the reference signal, wherein the five input signals comprise four logic inputs and one bias. In this embodiment, the five MZIs are arranged to perform the signal generation process one by one. Further, the weighting unit may include 12 MZIs in this embodiment. The step of adjusting the configuration of the MZIs may include: adjusting the configuration of the 12 MZIs which are used to form the weighting unit according to a 4 by 3 weight matrix to transform the generated five input signals into three output signals. Accordingly, the step of interfering each of the at least one output signal with the reference signal includes interfering each of the three output signals with the reference signal to obtain information encoded therein, i.e., information encoded in the magnitude and/or the phase of each output signal. In the second embodiment, the step of providing the apparatus may include: providing the apparatus which includes a single layer complex-valued neural network, wherein the single layer complex-valued neural network may be trained using any one of the following parameter sets: three phase parameters {φ₁, φ₂, φ₃}, or three magnitude parameters {ρ₁, ρ₂, ρ₃}, or a combination of phase and magnitude parameters {ρ₁, (ρ₂, φ₁), (ρ₂, φ₂)}.

[00100] In the third embodiment, when the apparatus is used to perform a handwriting recognition task, the step of providing the apparatus may include: providing an apparatus with a multi-layer perceptron formed by a plurality of single-layered complex-valued networks arranged in a cascading way or a recurrent way.

[00101] Fig. 3A is a schematic diagram illustrating a scaling architecture of an optical neural network according to some embodiments of the invention. Referring to Fig. 3A, the optical neural network includes an input layer, multiple hidden layers, and an output layer. In a complex-valued architecture, inputs, i.e. {x₁, x₂, ..., x_m}, may be encoded and manipulated by both magnitude and phase, i.e. {(ρ₁, θ₁), (ρ₂, θ₂), ..., (ρ_N, θ_N)}, during the initial input preparation and network evolution to generate the outputs {y₁, y₂, ..., y_k}. Fig. 3B is a schematic diagram illustrating a structure of an ONC 300 for a complex-valued neural network according to one embodiment of the invention. Referring to Fig. 3B, the ONC 300 includes a plurality of MZIs which are arranged to form a signal preparation unit, a weighting unit, and a coherent detection unit on a single chip, wherein the signal preparation unit is formed by the MZIs in region I and region II as shown in Fig. 3B, the weighting unit is formed by the MZIs in region III, and the coherent detection unit is formed by the MZIs in region IV. A coherent laser is used to generate an input light. The ONC 300 is essentially a multiport interferometer with MZIs arranged in a specific manner. Each MZI of the ONC 300 includes two BS-PS pairs. The functions of the two BS-PS pairs have been described above with the embodiments shown in Fig. 1 and Fig. 2, and will not be repeated here. The BS of each BS-PS pair has a fixed transmissivity of 50:50 and the PS of each BS-PS pair may be thermally modulated to tune the phase of a received light beam.

[00102] Referring to Fig. 3B, the input light from a coherent laser is coupled into the ONC 300 from a bottom port on the chip. The MZIs marked in regions I and II, region III, and region IV are used to realize the functions of the signal preparation unit, the weighting unit, and the coherent detection unit respectively.

[00103] The functions of the signal preparation unit are realized by the MZIs marked in region I and region II. The input light division and modulation are realized by the MZIs marked in region I, and the MZI in region II is used to generate the reference signal that will later be used for coherent detection. The on-chip light division makes sure that the light beams propagating along different optical paths have the same polarisation and share a stable phase difference. The input modulation is dictated by the machine learning task. When the ONC 300 is used to perform tasks with real-valued inputs, the light beams may be modulated by the magnitude and the phase differences between paths of different light beams are set to zero. When the ONC 300 is used to perform tasks with complex-valued inputs, the modulation includes both magnitude attenuation and path-dependent phase rotations. After the signal preparation process, four input signals and one reference signal are generated in this embodiment.

[00104] The functions of the weighting unit are realized by the MZIs marked in region III. The generated input signals travel through the 6 x 6 optical neural network marked in region III. An N-mode network realizes the weight matrix multiplication by transforming the input signals into an output signal according to

, and the transform matrix of each MZI is simplified according to

Equation (1):

where θ and Φ are two tunable variables. An optical network with N input signals realizes arbitrary N × N unitary weight matrices U(N) by varying the configuration of the MZIs in region III through adjusting the tunable phase shifters of the MZIs in region III, e.g., through adjusting a phase change added to the received light beam by the tunable phase shifter. The network may be made multi-layered by cascading the optical networks and activating them with a feedback-clocked strategy. The architecture could be either feedforward or recurrent.
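As a hedged illustration of how a single MZI acts on two optical modes, the following sketch composes the BS-PS-BS-PS sequence described above into a 2 × 2 matrix and checks that it is unitary. The symmetric 50:50 beam-splitter convention (a π/2 phase on the cross port), the placement of both phase shifters on the upper path, and all function names are assumptions made for this example; the matrix of Equation (1) may follow a different but equivalent convention.

```python
import numpy as np

# Assumed symmetric 50:50 beam splitter: the cross port picks up a pi/2 phase.
BS = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)

def phase_shifter(phase):
    """Phase change applied to the upper path only (assumed placement)."""
    return np.diag([np.exp(1j * phase), 1.0])

def mzi(theta, phi):
    """BS -> inner PS (theta) -> BS -> outer PS (phi), applied right to left."""
    return phase_shifter(phi) @ BS @ phase_shifter(theta) @ BS

def mzi_closed_form(theta, phi):
    """Closed form of the same product under the conventions assumed above."""
    s, c = np.sin(theta / 2), np.cos(theta / 2)
    core = np.array([[np.exp(1j * phi) * s, np.exp(1j * phi) * c],
                     [c, -s]])
    return 1j * np.exp(1j * theta / 2) * core

T = mzi(0.8, 1.3)
is_unitary = np.allclose(T.conj().T @ T, np.eye(2))
matches_closed_form = np.allclose(T, mzi_closed_form(0.8, 1.3))
print(is_unitary, matches_closed_form)
```

Under these assumptions, setting θ = π removes the cross-port terms, so each input mode is routed entirely to one output.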

[00105] The functions of the on-chip coherent detection unit are realized by the MZIs marked in region IV. The electrical complex signals are extracted from the output signals generated by the weighting unit. The output signals from the weighting unit may include information encoded in both magnitude and phase, whereas conventional intensity detection techniques can only access information encoded in the magnitude. In this embodiment, the coherent detection unit is configured to interfere each of the output signals from the weighting unit with the reference signal generated by the signal preparation unit to obtain the encoded information in both the magnitude and the phase of the output signals. It should be noted that in some other embodiments of the invention, the input, reference, and output signals may be real-valued signals, and accordingly, the coherent detection unit may be configured to only obtain the information encoded in the magnitude of the output signals.

[00106] Each of the output signals from the weighting unit is interfered with the reference signal and converted to an electrical signal by a photodetector, i.e. the balanced detector, integrated on the chip. The electrical signals amplified by a gainable transimpedance amplifier (TIA) are then acquired and processed by a classical processor. Feedback signals may be generated by the processor and sent back to the weighting unit of the ONC 300 to adjust the configurations of the MZIs which are used to form the weighting unit.

[00107] Fig. 4A shows a packaged ONC for a complex-valued neural network according to one embodiment of the invention. Fig. 4B shows a false-color micrograph of an MZI network with integrated heaters on the ONC shown in Fig. 4A. Fig. 4C shows a false-color micrograph of a waveguide-coupled Ge-on-SOI photodetector on the ONC shown in Fig. 4A. In this embodiment, the ONC with 8 modes and 56 phase shifters is provided. Different configurations of the ONC may be used in other embodiments. Each MZI includes two BS-PS pairs. The two BSs have a transmissivity of 50:50 and are realized by MMIs. The two PSs include an inner phase shifter θ and an outer phase shifter Φ. All the PSs are thermally tuned with integrated TiN heaters bonded to a printed circuit board (PCB). The TiN heaters may be calibrated and fitted with an average R-square value of 0.99.

[00108] Further details of some embodiments of the invention are provided in supplementary information below. For example, the characteristics of the phase shifters in some embodiments of the invention are provided in supplementary information 1 and 2 below, as shown in Figs. 8 and 9A-9B and Table S1. Experimental details of the signal preparation unit and the weighting unit according to some embodiments of the invention are provided in supplementary information 3 and 4 below, as shown in Figs. 10-15D. Details of the coherent detection conducted on the chip according to some embodiments of the invention are provided in supplementary information 5 below, as shown in Figs. 16A-16C.

[00109] In some embodiments of the invention, the formulation of a complex-valued neuron is similar to that of a conventional neuron model, except that all the parameters and variables are complex-valued, and the computation employs complex-valued arithmetic. The neuron is built by weighting each input with a complex number. The weighted inputs are summed up and processed by an activation function f. The output of the neuron is expressed according to Equation (2):

$$y = f\left(\sum_{i} w_i x_i + b\right) \tag{2}$$

where the weights w_i and bias b are in general complex numbers. Each input x_i to the neuron can either be complex-valued or real-valued.

[00110] The ONC implementation of such complex-valued computations is benchmarked in the following three separate tasks. The results are compared with a similarly configured optical chip that computes only on real values.

• Task 1: The implementation of fundamental two-bit logic gates using a single complex neuron. Notably, this includes XOR gate, which cannot be accomplished by a real-valued neuron. A three-layered real-valued neural network is usually required for solving the XOR problem.

• Task 2: The use of a complex-valued layer to classify Iris flowers into three possible subspecies using the real-world Iris dataset to illustrate an improvement over real-valued counterparts.

• Task 3: The task of handwriting recognition using a complex-valued network configured on the ONC. The benchmarks based on the MNIST database illustrate that the complex-valued network attains much higher accuracy than its real-valued counterpart, even when the neural chip was only given real-valued inputs and constrained to deliver real-valued outputs.

Task 1: Logic Gate Realization

[00111] The logic gate task is a toy task, in which the logic gates are emulated by complex-valued networks to showcase the apparatus proposed in embodiments of the invention. In implementing fundamental logic gates, the inputs are real-valued or complex-valued with possible phases constrained to 0 or π, the bias b may be treated as an additional constant input 1, and an additional complex-valued weight b is assigned. Equation (2) is simplified to

, where the weight vector

and the input vector

The weight matrix is updated after each iteration by the gradient

, where η is the learning rate, Ŷ is the expected output and y is the actual output. The mapping from complex-valued output to logical value is predefined as odd quadrants to logical value “0” and even quadrants to logical value “1” for all logic tasks. Fig. 5A is a schematic diagram illustrating a computation process of a complex-valued neuron according to the first embodiment of the invention. Fig. 5B is a schematic diagram illustrating the on-chip implementation of complex-valued neurons according to the first embodiment of the invention. Three input ports are encoded by magnitude, two of which are for the logical inputs and the remaining one is the constant 1 for bias. The weight vector

is implemented on the chip by configuring three MZIs that incorporate 6 tunable parameters {θ_i, φ_i}, i = 1, 2, 3. Both magnitude and phase are modulated in the network evolution. The output signal is measured in a phase-sensitive manner.

[00112] Figs. 5C and 5D are plots showing the training process of the NAND and XOR gates respectively according to the first embodiment. Ten iterations are conducted and recorded for each logic gate. The quadrants representing logical value “0” and those representing logical value “1” are marked. When processed by a complex-valued neuron, each of the four different possible combinations of logical inputs converges from a random starting point to the correct end point, via a continuous attenuation of magnitude and phase rotation. The arithmetic loss between the predefined expectations and the results approximated by the complex neuron converges to zero for both the real part and imaginary part of the output, which is shown in Figs. 19A-19D in supplementary information 7 below. Meanwhile, the final classification results are consistent with the truth tables as shown in Table S2 in the supplementary information 6 below. The complex-valued neuron can solve the general XOR problem as shown in Figs. 20A-20B in the supplementary information 5 below. By demonstrating logic gate tasks using complex-valued neurons implemented on the ONC proposed in various embodiments of the invention, the ability of the proposed ONC to solve linear tasks, as well as certain tasks that are linearly inseparable in the real domain such as the XOR gate, has been illustrated.
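To make the quadrant-based decoding concrete, the following sketch shows one way a single complex-valued neuron can reproduce the XOR truth table. It is an illustration under stated assumptions rather than the trained on-chip configuration: the weights are hand-picked instead of learned, logical 0 and 1 are assumed to be encoded as +1 (phase 0) and −1 (phase π), and the activation is taken to be the identity before decoding.

```python
import numpy as np

def quadrant(z):
    """Return the quadrant (1-4) of a complex number that lies off the axes."""
    return int(np.floor((np.angle(z) % (2 * np.pi)) / (np.pi / 2))) + 1

def decode(z):
    """Odd quadrants map to logical 0, even quadrants to logical 1."""
    return 0 if quadrant(z) % 2 == 1 else 1

def encode(bit):
    """Assumed encoding: logical 0 -> +1 (phase 0), logical 1 -> -1 (phase pi)."""
    return 1.0 if bit == 0 else -1.0

def neuron(w, x):
    """Weighted sum; the bias is the weight attached to the constant input 1."""
    return np.dot(w, x)

# Hand-picked weights [w1, w2, b], for illustration only (not trained values).
w_xor = np.array([1.0, 1j, 0.0])

truth_table = {}
for a in (0, 1):
    for b in (0, 1):
        x = np.array([encode(a), encode(b), 1.0])
        truth_table[(a, b)] = decode(neuron(w_xor, x))
print(truth_table)
```

With one real and one imaginary weight, the two inputs control the signs of the real and imaginary parts independently, so equal inputs land in odd quadrants and unequal inputs in even quadrants. A single real-valued neuron has no comparable second axis.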

Task 2: Classification of Dataset Iris

[00113] The second benchmark of the proposed ONC is the Iris flower classification. Here the task is to classify a given Iris flower into one of the three possible subspecies: setosa, versicolor, and virginica, based on four real inputs, i.e. the length and width of the petals and sepals. The non-triviality of this task is that the three species are indistinguishable by any single one of the four features. Fig. 6A shows the representative images from each class, demonstrating substantial overlap in features between the three subspecies according to the second embodiment of the invention.

[00114] The ONC in this embodiment is configured with trained complex-valued weight matrices. Fig. 6B shows validation results of the Iris flower classification task when only phase, i.e., only {φ₁, φ₂, φ₃}, is used to train the complex-valued neural network according to the second embodiment of the invention. Fig. 6C shows validation results of the Iris flower classification task when only magnitude, i.e., only {ρ₁, ρ₂, ρ₃}, is used to train the complex-valued neural network according to the second embodiment of the invention. Fig. 6D shows validation results of the Iris flower classification task when both magnitude and phase, i.e., the combinations of the magnitude and the phase {ρ₁, (ρ₂, φ₁), (ρ₂, φ₂)}, are used to train the complex-valued neural network according to the second embodiment of the invention. The data points are collected when validating training datasets and classifying testing datasets. The complex-valued outputs of neurons are shown in two-dimensional Cartesian plots. The circle/diamond/square markers in these three figures are the classification results of the training dataset, while the triangle markers represent the results of the blinded testing dataset. In each figure, the species “setosa” is clearly distinguished from the rest, and the species “versicolor” and “virginica” are separated into two clusters with negligible overlap. Data points that were wrongly classified in the blind test are highlighted using black circles.

[00115] In this embodiment, a single complex-valued layer is benchmarked against its real-valued counterpart. The complex-valued layer achieves an accuracy of 99.3%, outperforming the real-valued layer, which has an accuracy of 97.3%. The improvement in accuracy lies in the fact that a real-valued single layer network can only form a linear decision boundary between two classes, whereas a complex-valued single layer network overcomes this limitation. Please refer to supplementary information 8 below, as shown in Fig. 22 and Fig. 23, for further information related to this task.

Task 3: Handwriting Recognition with a complex-valued multilayer perceptron

[00116] Single-layered complex-valued neural networks built on the ONC proposed in embodiments of the invention are employed to build a multilayer perceptron (MLP). The MNIST dataset for the task of classifying handwritten digits is used for training and testing. Fig. 7A is a schematic diagram illustrating an optical neural network according to a third embodiment of the invention. As shown in Fig. 7A, the network includes an input layer W_in, a hidden layer W, and an output layer W_out. The 28 × 28 grayscale image is stretched to a 784 × 1 vector and compressed by the input layer into 4 inputs to be fed into the 4 × 4 hidden layer. The hidden layer is trained off-chip and implemented on the ONC. The output layer maps the 4 outputs to 10 classes that represent digits from 0 to 9. Fig. 7B is a plot showing a comparison of performances of a complex-valued network and a real-valued neural network implemented on the same chip according to the third embodiment of the invention. The solid lines C1 and C2 represent the results of the complex-valued algorithm while the dashed lines R1 and R2 represent the results of the real-valued algorithm. The solid line C1 and the dashed line R1 represent the accuracy of the training, and the solid line C2 and dashed line R2 represent the cost of the training. As shown in Fig. 7B, the complex-valued algorithm achieves a training accuracy of 93.1% and a testing accuracy of 90.6%, while the real-valued algorithm has a training accuracy of 84.3% and a testing accuracy of 82.1%. Thus, the complex-valued neural network significantly outperforms its real-valued counterpart. In addition, faster convergence is observed in the complex-valued algorithm.
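The layer dimensions described above can be sketched as a forward pass. This is a minimal shape-level illustration, not the trained network: the weights are random placeholders, the hidden layer is drawn as a random unitary matrix (the unitary constraint is mentioned under Numerical simulation below), and the magnitude activation and softmax read-out are assumptions chosen only to make the example complete.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n, rng):
    """Random n x n unitary from the QR decomposition of a complex Gaussian matrix."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(a)
    return q

def softmax(v):
    e = np.exp(v - np.max(v))
    return e / e.sum()

# Placeholder weights (hypothetical, untrained).
W_in = rng.normal(size=(4, 784)) / np.sqrt(784)   # input layer: 784 -> 4
W = random_unitary(4, rng)                        # 4 x 4 hidden layer
W_out = rng.normal(size=(10, 4))                  # output layer: 4 -> 10 classes

def forward(image):
    x = image.reshape(784)        # stretch the 28 x 28 image to a 784-vector
    h = W_in @ x                  # compress to 4 inputs
    z = W @ h                     # complex-valued weighting by the hidden layer
    a = np.abs(z)                 # assumed activation: detected magnitude
    return softmax(W_out @ a)     # scores for digits 0 to 9

probs = forward(rng.random((28, 28)))
print(probs.shape)
```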

[00117] To verify the strength of the complex-valued matrix and the complex-valued arithmetic by optical computing, the performance of a complex-valued neural network implemented on an ONC proposed in embodiments of the invention is compared across the following scenarios:

• (a) both magnitude and phase information are encoded and detected, i.e. completely complex;

• (b) only the magnitude information is encoded, but both magnitude and phase information are detected, i.e. real encoding;

• (c) both the magnitude and phase information are encoded, but only the magnitude information is detected, i.e. real detection;

• (d) only magnitude information is encoded and detected, i.e. real encoding and detection, and

• (e) completely real-valued network.

[00118] In the first four scenarios, elements of the weight matrix W of the hidden layer are all parameterized by complex values. Fig. 7C is a plot showing the performances of a complex-valued neural network implemented on an ONC in the five different scenarios mentioned above according to the third embodiment of the invention. As shown in Fig. 7C, even when both the encoding and detection are restricted to be real-valued, the complex-valued architecture achieves an accuracy of 87.7%, compared with 84.3% for its real-valued counterpart. In other words, the complex-valued algorithm, even when both the encoding and detection are implemented in real values, outperforms the real-valued neural network. Confusion matrices and chip configurations are further described in supplementary information 8, as shown in Fig. 25 and Table S3. Therefore, the complex-valued network outperforms its real-valued counterpart regardless of the encoding and detection method. Moreover, the real detection has a better performance than the real encoding, indicating that the complex encoding contributes more to network performance, which is in agreement with the fact that complex-valued networks are more informative and have a larger capacity given the same network size. The capacity of a neural network may be defined by the number of effective real-valued parameters, which is a function of the number of neurons N. Table 1 below sets out the performance of a single-layered network with N = 4 and N = 8. As shown in Table 1, a 4 × 4 complex-valued neural network with a capacity of 32, which has a performance of 93.1% in terms of accuracy, beats an 8 × 8 real-valued neural network with a capacity of 64, which has a performance of 92.3% in terms of accuracy, despite its smaller capacity. Therefore, to achieve comparable performance, the complex-valued algorithm requires a smaller chip size and thus far fewer inner components. For example, in this embodiment, 12 phase shifters of a 4-mode complex-valued chip are used in the complex-valued algorithm, while 56 phase shifters of an 8-mode real-valued chip are used in the real-valued algorithm.

Table 1
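The component counts quoted above can be checked with a few lines of arithmetic. The phase-shifter count N × (N − 1) is the expression given in supplementary information 1. The capacity formula (two real parameters per complex weight, one per real weight) is an assumption inferred from the two quoted capacities, and the function names are hypothetical.

```python
def phase_shifter_count(n):
    """Phase shifters needed for N x N weight connections: N * (N - 1)."""
    return n * (n - 1)

def capacity(n, complex_valued):
    """Effective real-valued parameters (assumed: 2*N*N if complex, N*N if real)."""
    return (2 if complex_valued else 1) * n * n

for n, is_complex in ((4, True), (8, False)):
    kind = "complex" if is_complex else "real"
    print(n, kind, capacity(n, is_complex), phase_shifter_count(n))
```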

[00119] The costs to optically implement the complex-valued and real-valued neural networks are analyzed from three aspects: the input encoding, the weight multiplication, and the detection methods. Real encoding also requires the control of the relative phases between MZIs; otherwise, the output would fluctuate with accumulated phase variability. Therefore, the effective information encoding onto phase components is more cost-effective since the inputs are more informative in two dimensions. The optical implementation of a real-valued matrix on a linear optical circuit is the same as that of a complex-valued matrix, which requires the modulation of all the inner phase shifters and outer phase shifters. Phase-sensitive detection schemes used in complex-valued algorithms are more expensive than intensity detection. Even when only real-valued detection is performed, the complex-value-encoded algorithms maintain an accuracy of 91.1% as shown in Fig. 7C, which is still significantly higher than that of the real-valued algorithm, i.e. 84.3%. The compromise made by replacing phase-sensitive detection with intensity detection can be interpreted as a choice of the activation function in neural network architectures. On conventional electronic computers, weight matrices are usually trained with dimensions of at least 64 × 64 or even 128 × 128, which is hard to achieve with optical circuits at present but should become achievable as industrial foundry processes develop. The embodiments of the invention highlight that fewer layers and neurons are required on complex-valued optical neural networks, which means a small-scale optical neural network could achieve comparable or better performance than its electronic counterparts. Further, a complex-valued algorithm expands the representation space to a higher dimension, which scales up the capability of optical neural networks without adding complexity to the hardware.
[00120] In view of the three tasks described above, the complex-valued ONC proposed in embodiments of the invention has been benchmarked in multiple practical settings including (a) realization of elementary logic gates in the single neuron setting, (b) classification of Iris subspecies by a single-layer network and (c) handwriting recognition using a multilayer perceptron (MLP) network. The performance of the complex-valued ONC has been compared to a similar on-chip implementation using real-valued perceptrons. In all cases, the complex-valued ONC proposed in embodiments of the invention outperforms the real-valued implementation. In the elementary gate realization, the realization of several logic gates has been illustrated, including a nonlinear XOR gate by a single complex-valued neuron, a task that is impossible for a single real-valued neuron. In Iris classification, an accuracy of up to 99.3% is obtained by a complex-valued layer, comparing favourably to 97.3% obtained by a real-valued layer. In handwriting recognition, a testing accuracy of 90.6% is achieved when using a 4 × 4 hidden layer, an 8.5% improvement over the real-valued counterpart. Moreover, the performance gap persists when encoding and decoding modules are in intensity only, indicating that the performance advantage can be gained in using the proposed phase-sensitive ONC even for all real-valued interfaces. The experimental results present a promising avenue towards realizing deep complex-valued neural networks with dedicated integrated optical computing chips, and potential implementations of high dimensional quantum neural networks.

Experimental set-up

[00121] In some embodiments of the invention, the experimental set-up below was used. The light source was a 1550-nm laser with 12 dBm power from a Santec TSL-510 tunable laser. A polarization controller was applied to maximize the coupling of the light source to the ONC. A Peltier controlled by Thorlabs TED200C was used to stabilize the temperature of the chip and reduce the heat fluctuations caused by ambient temperature and the heat crosstalk within the chip. The data acquisition module includes a gainable transimpedance amplifier (TIA) and an analog-to-digital converter NI-9215 with a resolution of 16 bits. The performing circuit which provides the electrical power to phase shifters has a 16-bit output precision.

Chip characterization

[00122] The I-V characteristics of each heater were calibrated. The relationship between electrical power and current was fitted by a non-resistive model

The characterization of each phase shifter was done by varying the applied current while measuring the optical power at the output port. The collected measurement data were fitted with

where y was the optical power, d was a constant background, a was the maximum magnitude of the signal, b, c were coefficients depicting the relationship between the phase and the electrical power P computed by the non-resistive model. An average R-square value of 0.99 was achieved with the fittings, which indicated that the model adequately reproduced the data observed from the measurements. The average visibility was 99.85%.
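The two figures of merit quoted above, R-square and visibility, can be computed as follows. This is a sketch on synthetic, noise-free data, not the measured curves. The raised-cosine fringe model is only an assumed stand-in for the fitting function, and the numerical parameters are illustrative. R-square and visibility use their usual textbook definitions.

```python
import numpy as np

def fringe(power, a, b, c, d):
    """Assumed fringe model: background d plus a raised cosine of height a."""
    return 0.5 * a * (1.0 + np.cos(b * power + c)) + d

def r_squared(measured, fitted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((measured - fitted) ** 2)
    ss_tot = np.sum((measured - np.mean(measured)) ** 2)
    return 1.0 - ss_res / ss_tot

def visibility(optical_power):
    """Fringe visibility, (max - min) / (max + min)."""
    hi, lo = np.max(optical_power), np.min(optical_power)
    return (hi - lo) / (hi + lo)

# Synthetic sweep over one full fringe (illustrative numbers, not measured data).
power = np.linspace(0.0, 0.07, 2001)              # electrical power in W
params = dict(a=1.0, b=2 * np.pi / 0.07, c=0.3, d=0.001)
measured = fringe(power, **params)

r2 = r_squared(measured, fringe(power, **params))
vis = visibility(measured)
print(r2, vis)
```

Under this model the visibility equals a / (a + 2d), so a small constant background d corresponds to a visibility close to 1.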

Coherent detection

[00123] The signal and reference lights were injected into an MZI with an inner phase θ. For the intensity detection, the MZI was configured to a state of full transmission with θ = π. For the phase detection, the angle θ was set to be π/2, and the signal was interfered with the reference light. By connecting the two photodetectors in a balanced way, the output current was expressed as

, where A_s and A_r were the magnitudes of the signal light and the reference light, respectively. Similarly, by adding a phase shift of π/2 to the reference light, a second output current was obtained. Therefore, Φ_s was determined.
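The two-measurement procedure above can be sketched numerically. The example below assumes an ideal symmetric 50:50 combiner (the MZI set to θ = π/2), unit detector responsivity, and a reference of known magnitude. The sign conventions follow from that assumed beam-splitter matrix and may differ from those of the expressions in this paragraph.

```python
import numpy as np

# Assumed symmetric 50:50 combiner acting on (signal, reference).
BS = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)

def balanced_current(e_sig, e_ref):
    """Difference of the two output intensities (unit responsivity assumed)."""
    out = BS @ np.array([e_sig, e_ref])
    return abs(out[0]) ** 2 - abs(out[1]) ** 2

def recover(e_sig, e_ref):
    """Recover signal magnitude and phase relative to the reference."""
    i_a = balanced_current(e_sig, e_ref)                            # 2*As*Ar*sin(dphi)
    i_b = balanced_current(e_sig, e_ref * np.exp(1j * np.pi / 2))   # -2*As*Ar*cos(dphi)
    magnitude = np.hypot(i_a, i_b) / (2 * abs(e_ref))
    phase = np.arctan2(i_a, -i_b)
    return magnitude, phase

mag, phase = recover(0.6 * np.exp(1j * 0.7), 1.0 + 0j)
print(mag, phase)
```

Because the two currents are the sine and cosine quadratures of the same phase difference, a single two-argument arctangent recovers the phase without ambiguity over the full 2π range.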

Numerical simulation

[00124] The numerical simulation was conducted in Python. The single complex-valued neuron was built based on open source code. The complex-valued layer and MLP were built in TensorFlow and trained by RMSPropOptimizer. In the MLP, the weight matrix of the hidden layer was trained with unitary constraints to simplify chip implementation, as a singular-value-decomposition (SVD) method would otherwise have to be adopted to realize arbitrary complex-valued weight connections.

Data availability

[00125] The data that support the findings of this study are available from the corresponding authors on reasonable request.

Supplementary Information

[00126] To further explain the details and advantages of the apparatus and method provided in various embodiments of the invention, the following supplementary information 1-12 are provided.

Supplementary Information 1: Phase shifter characteristics

[00127] In some embodiments of the invention, the MZIs may be modulated by the thermo-optic effect. The heat may be provided by integrated heaters on the ONC, e.g. TiN heaters. Fig. 8 is a schematic diagram showing a cross-section of a designed waveguide according to one embodiment of the invention. In this embodiment, the waveguide cross-section is 450 × 220 nm², and the TiN heater has a length of 100 μm, a width of 3 μm, and a thickness of 120 nm. The distance between the TiN heater and the top of the waveguide is 2 μm. The thermo-optic coefficient of silicon is

From the experimental results, the chip-integrated heater is a slightly non-resistive load, as observed from the I-V characteristics shown in Figs. 9A and 9B. As shown in Fig. 9A, the chip-integrated heater is not an ideal resistive load; it can be approximated by a resistor of 342.8 Ω. Fig. 9B is a plot showing the polynomial fittings of the relationship between electrical power and applied current on the non-resistive heater as shown in Fig. 8. As noted, the coefficient of the cubic term, p₁, is 0.003692, which is negligible.

[00128] The power consumption of the TiN heater is summarized as follows. Theoretically, the voltage and current required to drive the heater for a 2π phase shift are 5 V and 15 mA respectively. From the experimental results, the equivalent resistance of each heater is about 342.8 Ω, and thus the electrical resistivity is 1.25 × 10⁻⁶ Ω·m. The average electrical power required for a 2π phase shift is 70 mW, i.e., about 14.1 mA. Notably, in actual experiments, not all phase shifters work at full load. For N×N weight connections, i.e., N neurons, the number of required phase shifters is N×(N−1). Once the chip is trained, the electrical power applied to the phase shifters remains as a standing cost. The power consumption may be further reduced by adopting phase-change materials or ultra-low-power MEMS phase shifters. The lower power consumption of the ONC proposed in embodiments of the invention also follows from the fact that the complex-valued algorithm requires fewer neurons and layers, and thus fewer computational resources and less power, than real-valued algorithms.
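The heater figures above can be cross-checked with the relation P = I²R. The following is a minimal sketch in Python that treats the heater as an ideal 342.8 Ω resistor, which the text notes is only approximately true; the function and constant names are illustrative and are not part of the disclosure.

```python
import math

R_HEATER_OHM = 342.8   # equivalent heater resistance reported in the text
P_2PI_W = 70e-3        # average electrical power for a 2*pi phase shift

def current_for_power(power_w, resistance_ohm=R_HEATER_OHM):
    """Current (A) that dissipates `power_w` in an ideal resistor, from P = I^2 * R."""
    return math.sqrt(power_w / resistance_ohm)

def phase_shifter_count(n):
    """Number of phase shifters for an N x N weight connection, N*(N-1) as stated."""
    return n * (n - 1)

i_2pi = current_for_power(P_2PI_W)
print(f"current for a 2*pi shift: {i_2pi * 1e3:.1f} mA")
print(f"phase shifters for N = 6: {phase_shifter_count(6)}")
```

Under the ideal-resistor assumption the computed current lies close to the reported value of about 14 mA; the small difference is consistent with the rounding of the 70 mW figure.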

[00129] The modulation frequency of thermal-optical phase shifters may reach up to 10 kHz. Once the network is trained, the weight matrix is fixed, which means that the phase shifters no longer need to be modulated. The demonstrated chip is faster than an electronic computer, as discussed for two working conditions. The first is the static condition, in which the machine learning model is trained and implemented on the chip. The chip then becomes passive and application-specific, with all components remaining static. The computation speed is not limited by the modulator rate, and the most time-consuming part, the multiply-accumulate operations, is accomplished on the ONC at the speed of light. The second is the dynamic condition, in which the modulators are working. Although the modulation rate of thermal-optical modulators is only 10 kHz, the typical rate of carrier-injection modulators can reach tens of GHz. For a complex-valued neural network with a dimension of N×N×L, by matrix representation, 4N² multiply-accumulate operations are required for each layer. Therefore, with carrier-injection modulators that have a modulation rate of 10 GHz, the ONC can perform 4N² × L × 10¹⁰ MAC/s. The chip footprint determines that the light propagation is within a single second. For a single-layer ONC with N = 100, a rate on the order of 10¹⁴ MAC/s is achieved, which can be benchmarked against a conventional CPU (10¹¹ FLOPS) and an advanced GPU (10¹² FLOPS). FLOPS stands for floating point operations per second.
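The throughput estimate can be reproduced directly from the expression 4N² × L × f, where f is the modulation rate. The sketch below is only an arithmetic check of that expression; the function name is illustrative. The factor 4 is consistent with one complex multiplication costing four real multiplications.

```python
def mac_per_second(n, layers, modulation_rate_hz):
    """MAC rate 4 * N^2 * L * f for an N x N x L complex-valued network."""
    return 4 * n**2 * layers * modulation_rate_hz

# Single layer, N = 100, carrier-injection modulators at 10 GHz.
rate = mac_per_second(n=100, layers=1, modulation_rate_hz=10e9)
print(f"{rate:.1e} MAC/s")
```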

[00130] In this embodiment, isolation trenches may be adopted to improve the heating efficiency and reduce the crosstalk between adjacent heaters. How the trenches help reduce thermal crosstalk is shown in Fig. 10, which compares phase shifters with and without an isolation trench according to the embodiment shown in Fig. 8. The calibrated crosstalk factor k decreases as the distance between the neighbor heater, i.e. the n-heater, and the calibration heater, i.e. the c-heater, increases. As shown in Fig. 10, when the distance between the n-heater and the c-heater is the smallest, the calibrated crosstalk factor k decreases from 3.38 rad/W to 0.46 rad/W when isolation trenches are provided.

[00131] The calibration is done as follows. First, the heater under calibration is denoted as the c-heater and one of its neighbor heaters as the n-heater. It is assumed that the heat from the n-heater adds onto the c-heater, shifting its calibration curve to the left. When no electrical power is supplied to the c-heater, its phase shift is induced solely by the crosstalk. By increasing the electrical power applied to the n-heater, a calibration curve of the effect of crosstalk on the c-heater can be obtained.

[00132] The electrical circuits incorporated with the ONC are shown in Figs. 11A-11B. Fig. 11A shows a schematic diagram illustrating the main components in the experimental setup according to one embodiment of the invention. Fig. 11B is a raw picture showing the central part of the ONC with deep isolation trenches according to the embodiment in Fig. 11A. Fig. 11C is a photograph showing the chip testing bed in an experiment according to this embodiment. Figs. 11D and 11E show SEM pictures of a grating coupler and an MMI respectively according to this embodiment. The heaters may be supplied by a multi-channel current source. The maximum current of each channel is 24 mA with a resolution of 16 bits, meaning that the minimum step is 370 nA. With all channels in operation, the maximum current from each channel is 15 mA. In data acquisition, the transimpedance amplifiers (TIA) for amplifying the current signal from the photodetector (PD) offer a tunable gain from 1.5 × 10⁴ to 1.5 × 10⁶ V/A. The digital-to-analog converter (DAC) in use has a detection range of ±10 V and a sampling rate of 100 kS/s/ch simultaneously at 16-bit resolution. Peltier temperature controllers (TEC) are used to facilitate heat dissipation and maintain a constant temperature.

Supplementary information 2: Chip characterization and loss analysis

[00133] The calibration may be done by applying electrical power to the phase shifter while measuring the optical power output at the corresponding optical port. The collected data is fitted according to Y = −a · cos(b · (P + c)) + d, where d is a constant background, a is the maximum amplitude of the signal, and b and c are coefficients describing the relationship between the phase and the supplied power P. The average R² value obtained is 0.99 (the ideal value being 1), which indicates that the model adequately reproduces the data observed in the measurements. The average visibility is 99.85%. The extinction ratio of the MZI is about 27 dB. Fig. 12 shows plots illustrating fitting curves of the calibration of several exemplary phase shifters (heaters) according to some embodiments of the invention. The fitting results are shown in Table S1. The fitting parameters are reported with a 95% confidence level. From this characterization, the electrical power required to reconfigure each phase shifter to a designed phase can be determined.
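The way the fitted model is used to set a designed phase can be sketched as follows. This is a minimal illustration, assuming that the phase is the cosine argument b · (P + c) and that b is positive; the parameter values are hypothetical (b is chosen so that one 2π period corresponds to the 70 mW mentioned earlier) and are not the fitted values of Table S1.

```python
import math

def mzi_output(power_w, a, b, c, d):
    """Fitted calibration model Y = -a*cos(b*(P + c)) + d."""
    return -a * math.cos(b * (power_w + c)) + d

def power_for_phase(phase_rad, b, c):
    """Smallest non-negative power P with b*(P + c) equal to phase_rad modulo 2*pi."""
    period_w = 2 * math.pi / b          # electrical power for one full 2*pi period
    return (phase_rad / b - c) % period_w

# Hypothetical fit parameters, for illustration only.
a, d = 0.5, 0.5
b = 2 * math.pi / 70e-3                 # rad per watt, assuming 70 mW per 2*pi
c = 5e-3                                # watt, offset of the calibration curve

p_pi = power_for_phase(math.pi, b, c)
print(f"power for a pi phase shift: {p_pi * 1e3:.1f} mW")
print(f"model output at that power: {mzi_output(p_pi, a, b, c, d):.3f}")
```

The modulo step simply picks the lowest-power solution within one period; a real controller would also have to respect the current limit of the source.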

[00134] The optical losses of the silicon photonic chip proposed in embodiments of the invention are also investigated. The main contributors to the optical loss are the coupling loss and the component loss. The losses are calibrated before the experiments. Grating couplers are used to guide light into the ONC, and the total coupling loss, i.e. the loss at the input and output ports, is 11.6 dB at 1550 nm. The standard propagation loss of the 450 × 220 nm² waveguide is 2 dB/cm. The length of the proposed ONC structures is about 8 mm and thus the estimated propagation loss is 1.76 dB. The component loss mainly comes from the MMIs, which is 0.2 dB each. For an N-mode interferometer, the component loss is 0.2N dB for each optical path. Current fabrication technologies support a 100-mode interferometer.
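A rough loss budget can be put together from the three contributions named above. The sketch below assumes that the contributions simply add in dB along one optical path and uses the figures quoted in the text; the function names are illustrative.

```python
COUPLING_LOSS_DB = 11.6      # total input + output coupling loss at 1550 nm
PROPAGATION_LOSS_DB = 1.76   # estimated propagation loss quoted in the text
MMI_LOSS_DB = 0.2            # loss per MMI

def total_loss_db(n_modes):
    """Loss along one optical path of an N-mode interferometer (0.2*N dB of MMI loss)."""
    return COUPLING_LOSS_DB + PROPAGATION_LOSS_DB + MMI_LOSS_DB * n_modes

def transmission(loss_db):
    """Fraction of optical power that survives a loss given in dB."""
    return 10 ** (-loss_db / 10)

loss_6 = total_loss_db(6)
print(f"6-mode path loss: {loss_6:.2f} dB, transmission {transmission(loss_6):.3f}")
```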

Table S1 Fitting results of the characterizations of exemplary heaters

Supplementary information 3: Input preparation

[00135] The following description is based on two cases: the binary classification of a toy dataset, the Moon, and a 4-input logical XOR. The decomposition and implementation procedure of the input preparation matrix is as follows.

[00136] STEP 1: Normalize the input. As input preparation is based on the proportion of different paths in the light beam, all the combinations of the inputs are first normalized. For example, to generate the inputs (x₁, x₂), x₁ ∈ [−1, 1] and x₂ ∈ [−1, 1], the bias, which is the constant 1, is first appended to the inputs, and then the inputs are normalized to [expression missing in the source].

[00137] STEP 2: Decompose the normalized proportion into phase angles on each phase shifter. Fig. 13A is a schematic diagram illustrating the arrangement of MZIs in an ONC according to one embodiment of the invention. The MZIs are arranged to form different function units including the signal preparation unit, weighting unit, and coherent detection unit. Each MZI has two working conditions, on and off. MZIs in black are on and those in gray are off. A complex-valued matrix is decomposed as W = UΣV. In this embodiment, three inputs x₁, x₂, x₃ are prepared by MZIs T₃, T₄, and T₅. When measuring the input preparation results, the successor MZIs are configured to be an identity matrix. Suppose the input is [input vector missing in the source]. The MZIs in use include T₅, T₄ and T₃. The transfer function of each MZI is as shown in the following equation:

[equation missing in the source]

[00138] As the input is [input vector missing in the source], the analytical output obtained from T₅ is [expression missing in the source]. A global compensation phase Φ_c is temporarily introduced for easier computation, and by solving the resulting equations the phase angles of T₅ are obtained. Thus, the input of T₄ from the down port is [expression missing in the source], and by solving the corresponding equations the phase angles of T₄ are obtained. The down port input of T₃ and its phase angle can then be obtained.

[00139] STEP 3: Implement the phase angles onto the chip. For a binary input with bias [input vector missing in the source], the MZIs under control include T₅, T₄ and T₃ with (θ₅, Φ₅) = (1.9106, 0.7854), (θ₄, Φ₄) = (−1.5708, 4.7124), θ₃ = −3.1416. The global phase will not affect the chip performance. By configuring the subsequent MZIs to form an identity matrix, the quality of the prepared input signals can be measured. For better illustration, the input (x₁, x₂), x₁ ∈ [−1, 1] and x₂ ∈ [−1, 1], may be scanned in steps of 0.1 as shown in Fig. 13B. In this embodiment, x₃ is the constant bias set as 1. In this embodiment, 441 (21²) sets of input signals are prepared, which will be used for the classification of the half-moon dataset. The x-axes of the plots in Fig. 13B (a)-(c) show the input index, while the y-axes show the magnitude of the output signals from the three ports, i.e. |x₁|, |x₂|, and |x₃|, respectively. Fig. 13B (d) shows a zoom-in of the part of the plot in Fig. 13B (a) which is highlighted by a black circle. From the measurement results shown in Fig. 13B, the standard deviation σ and the limit of detection 3σ = 0.02 can be obtained.
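The magnitudes of the internal phases quoted in STEP 3 can be related to the normalization in STEP 1 with a short calculation. The sketch below rests on two assumptions that are not confirmed by the text: that the input with bias is scaled to unit norm, and that an MZI with internal phase θ keeps an amplitude fraction cos(θ/2) at one port. Signs and the external phases Φ depend on the MZI transfer function, which is not reproduced here, so only magnitudes are compared.

```python
import math

def normalize_with_bias(x1, x2, bias=1.0):
    """Append the constant bias and scale to unit norm (assumed normalization)."""
    vec = (x1, x2, bias)
    norm = math.sqrt(sum(v * v for v in vec))
    return tuple(v / norm for v in vec)

# Binary input (1, 1) with bias 1: three equal amplitudes of 1/sqrt(3).
x1, x2, x3 = normalize_with_bias(1.0, 1.0)

# First splitter: keep one amplitude, pass the remainder on (assumed convention).
theta_first = 2 * math.acos(abs(x3))
remaining = math.sqrt(1.0 - x3 ** 2)

# Second splitter: divide the remaining amplitude between the last two outputs.
theta_second = 2 * math.acos(abs(x2) / remaining)

print(f"|theta| of first splitter:  {theta_first:.4f} rad")
print(f"|theta| of second splitter: {theta_second:.4f} rad")
```

Under these assumptions the two angles agree in magnitude with the values 1.9106 and 1.5708 quoted for θ₅ and θ₄.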

[00140] In output acquisition, each data point is averaged over 50 samplings. The theoretical resolution of phase modulation is determined by the resolution of the current source. With a minimum increment of 370 nA, the theoretical resolution of phase modulation is 4 × 10⁻⁹ rad based on the calibration curve. However, the actual resolution of phase modulation is limited by the optical detection method. In this case, the phase shift is detected by measuring the intensity of the interfering light signals, hence the resolution of the measured light intensity (r) limits the resolution of phase modulation (x) according to sin(x) = r. Using the small-angle approximation sin(x) ≈ x for small x, the phase modulation resolution is x ≈ r. Given that the resolution of intensity detection is 0.02, the resolution of phase modulation is approximately 0.02 rad.
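Two of the numbers in this paragraph are easy to verify: the current step of a 16-bit, 24 mA source, and the quality of the small-angle approximation at r = 0.02. The following sketch is an arithmetic check only, with illustrative variable names.

```python
import math

FULL_SCALE_A = 24e-3   # maximum current per channel
BITS = 16              # resolution of the current source

# Minimum current increment of an ideal 16-bit source.
current_step_a = FULL_SCALE_A / 2 ** BITS

# Phase resolution implied by an intensity resolution r through sin(x) = r.
intensity_resolution = 0.02
phase_exact = math.asin(intensity_resolution)
phase_small_angle = intensity_resolution

print(f"current step: {current_step_a * 1e9:.0f} nA")
print(f"approximation error: {phase_exact - phase_small_angle:.2e} rad")
```

The approximation error is several orders of magnitude smaller than the 0.02 rad resolution itself, so replacing sin(x) by x does not change the stated result.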

[00141] The limited resolution of intensity detection and phase detection means that if an event lies within 0.02 of the boundaries established by the hyperplane separating outputs “1” and “0”, misclassification is possible. For example, in the classification of the Iris dataset, samples located near the boundary are likely to be misclassified due to experimental error. Logic XOR is a more specific case: although the boundary is formed by the real and imaginary axes, the target destinations of the four binary inputs are exactly the predefined end points, which are far from the boundary and therefore tolerate a resolution of 0.02. Thus 100% classification accuracy is achieved.

Supplementary information 4: Decomposition and implementation of a complex weight matrix

[00142] The decomposition of weight matrices is a similar procedure to signal/input preparation. Here an explicit example of decomposing and implementing the weight matrices for classifying the Moon dataset and the 4-input XOR is provided. The weight matrices for classifying the dataset Moon may be denoted as follows:

[00143] STEP 1: Normalization of weight matrices. The weight matrices implemented on the chip should have a norm no larger than 1. This follows from the fact that the chip is passive, so the optical power can only decay and cannot increase. The bias of subsequent layers should be scaled according to the normalization factors of the previous layer. By including the bias in the weight matrices and normalizing, the following weight matrices are obtained:

[00144] STEP 2: Singular value decomposition. As the transfer matrix of a linear optical circuit is unitary, an arbitrary complex-valued matrix needs to be decomposed into a product of a unitary matrix, a diagonal matrix, and another unitary matrix, i.e. W = UDV. Correspondingly, different areas on the ONC are arranged to realize the different matrices as shown in Fig. 13A. For the weight matrix W₁, the process is executed according to equation (S1):

[00145] The same process is executed for W₂.

[00146] STEP 3: Decomposition into phase angles. The decomposition scheme is shown in Fig. 14, taking a 6-mode structure as an example. In each subfigure, the black MZIs are the ones being determined, and the “X” in the matrices denotes the elements correspondingly being determined. The decomposition starts from the rightmost column of MZIs. It is supposed that the light is injected from the first input port, so that the outputs correspond to the 6th column in the weight matrix. In this way, T₁, T₆, T₁₀, T₁₃, T₁₅, and M₁ = T₁T₆T₁₀T₁₃T₁₅ can be determined. From U = M₁U₁ and U₁ = M₁⁻¹U, the next column T₂, T₇, T₁₁, T₁₄ can be determined, and so on, until all the phases needed to configure a designed unitary matrix are obtained. Compensation phases are placed before the matrix input and can be implemented on the outer phase shifters of the last stage.
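The order in which the MZIs are determined can be generated programmatically. The sketch below assumes a labelling inferred from the MZI names quoted above: the MZIs are numbered row by row, with rows of length N−1, N−2, ..., 1, and step k of the decomposition takes the k-th MZI of every row that is long enough. This layout is an assumption for illustration and is not taken from Fig. 14 itself.

```python
def mzi_rows(n_modes):
    """Label MZIs 1..N(N-1)/2 row by row; row lengths N-1, N-2, ..., 1 (assumed layout)."""
    rows, label = [], 1
    for length in range(n_modes - 1, 0, -1):
        rows.append(list(range(label, label + length)))
        label += length
    return rows

def decomposition_steps(n_modes):
    """Group MZI labels by the decomposition step in which they are determined."""
    rows = mzi_rows(n_modes)
    return [[row[k] for row in rows if len(row) > k] for k in range(n_modes - 1)]

steps = decomposition_steps(6)
for index, group in enumerate(steps, start=1):
    print(f"step {index}: " + ", ".join(f"T{label}" for label in group))
```

For the 6-mode example, the first two groups reproduce the sets T₁, T₆, T₁₀, T₁₃, T₁₅ and T₂, T₇, T₁₁, T₁₄ named in the text, and the total of 15 MZIs with two phase shifters each matches the N×(N−1) count given earlier.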

[00147] STEP 4: Chip implementation and validation. Fig. 15A and Fig. 15B are plots illustrating chip outputs by intensity measurements, i.e., the results of implementing input preparation and weight matrices corresponding to two respective layers, according to one embodiment of the invention. In the classification of the half-moon dataset, two output ports are monitored, with y₁ denoting the up port and y₂ denoting the bottom port of T₁₅ in Fig. 14. The light-grey box represents the noise floor of the output signals, as theoretically the lowest output signal should be zero. The light-grey part is the zero bias of the detector during intensity detection. The decision surfaces from the second-layer outputs are shown in Fig. 15C and Fig. 15D. The chip validation results are in accordance with the ideal simulation results. In the machine learning task setting, category “0” is decided by |y₁| > |y₂| and category “1” is decided by |y₁| < |y₂|. Thus the decision boundary is computed by solving the equation |y₁| = |y₂|. Fig. 15C shows the dataset distribution and the theoretical decision surface obtained from training in this embodiment. Fig. 15D is acquired by enumerating (x₁, x₂), x₁ ∈ [−1, 1] and x₂ ∈ [−1, 1], in steps of 0.1. The resolution of the decision surface depends on the step size. The black rims are the theoretical output, shown for comparison with the experimentally acquired decision surfaces.
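The decision rule and the enumeration used for Fig. 15D can be sketched in a few lines. In the sketch below, `toy_outputs` is a hypothetical stand-in for the chip and is not the trained two-layer network of this embodiment; only the rule |y₁| > |y₂| and the 0.1 step grid come from the text.

```python
def classify(y1, y2):
    """Category 0 if |y1| > |y2|, otherwise category 1 (ties fall on the boundary)."""
    return 0 if abs(y1) > abs(y2) else 1

def input_grid(step=0.1, lo=-1.0, hi=1.0):
    """All (x1, x2) pairs on a square grid from lo to hi with the given step."""
    count = round((hi - lo) / step) + 1
    axis = [lo + k * step for k in range(count)]
    return [(x1, x2) for x1 in axis for x2 in axis]

def toy_outputs(x1, x2):
    """Hypothetical outputs (y1, y2); NOT the network described in the text."""
    return complex(x1, x2), 0.5 + 0j

grid = input_grid()
decision_surface = [classify(*toy_outputs(x1, x2)) for x1, x2 in grid]
print(len(grid), "grid points,", decision_surface.count(0), "in category 0")
```

A finer step gives a smoother decision surface at the cost of more measurements, which is the trade-off noted in the paragraph above.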

Supplementary information 5: On-chip Phase-diversity homodyne detection

[00148] Phase-diversity homodyne detection removes the common DC components and determines both the cosine and the sine of an angle, so that the angle can be identified uniquely in [0, 2π]. The fundamental concept is to take the product of the electric fields of the modulated signal light and the reference light. The principle of phase-diversity homodyne detection is shown in Fig. 16A. A commercial 90° optical hybrid circuit may be used to realize phase-diversity homodyne detection. Here the on-chip realization of phase-diversity optical homodyning is highlighted. Since the input signals and the reference signal are divided from the same light source, they have the same frequency and the same polarisation. Suppose the output of the neural network is E_s and the reference signal is E_l,

where A_s(t) and A_l(t) are their amplitudes, ω_s and ω_l are the optical frequencies, and Φ_s and Φ_l are the phases. The coherent detection may be realized by an MZI. According to the transfer function, the output field is as shown in equation (S4).

The outputs from the up port and the bottom port are as shown in Equation (S5) and (S6) respectively,

[00149] The photocurrent is proportional to the square of the input optical signal. When θ = π/2, the intensities of the two outputs are as follows:

Thus, the cosine component of the relative phase ΔΦ = Φ_s − Φ_l is obtained. To determine ΔΦ ∈ [0, 2π], an additional π/2 phase shift is added to the reference light to obtain another two outputs,

The corresponding photocurrents are

By applying a balanced detector between I₁ and I₂, and between I₃ and I₄, the common ground can be removed,

[00150] The relative phase, i.e. the phase difference, can be retrieved from the detection results, and the magnitude is obtained by direct detection of the light intensity. The on-chip coherent detection may be realized by an MZI as shown in Fig. 16B. Fig. 16B is a schematic diagram illustrating the process of on-chip homodyne detection using an MZI and a balanced detector with photodiodes PD1 and PD2. A transimpedance amplifier (TIA) and an operational amplifier (OPAMP) are used for signal amplification. Fig. 16C shows plots of exemplary results of the coherent detection obtained by varying the signal phase, corresponding to equations (S13) and (S14). For each phase, with its cosine and sine values acquired, the phase can be uniquely determined. With on-chip detection, the fluctuating phase caused by fiber components may be avoided, and the stability and credibility of the detection may be significantly increased.
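The principle of recovering a phase in [0, 2π) from two balanced measurements can be illustrated numerically. The sketch below models the combiner as an ideal 50:50 element producing (E_s ± E_l)/√2, which is an assumed textbook convention and not necessarily the MZI transfer function of equation (S4); time dependence is dropped because both fields share the same frequency.

```python
import cmath
import math

def balanced_difference(e_signal, e_reference):
    """Difference of the two photocurrents behind an ideal 50:50 combiner (assumed model).

    Equals 2*A_s*A_l*cos(delta_phi); the common DC terms cancel.
    """
    i_plus = abs((e_signal + e_reference) / math.sqrt(2)) ** 2
    i_minus = abs((e_signal - e_reference) / math.sqrt(2)) ** 2
    return i_plus - i_minus

def recover_relative_phase(e_signal, e_reference):
    """Relative phase in [0, 2*pi) from the in-phase and quadrature measurements."""
    in_phase = balanced_difference(e_signal, e_reference)          # cosine term
    quadrature = balanced_difference(e_signal, 1j * e_reference)   # reference shifted by pi/2
    return math.atan2(quadrature, in_phase) % (2 * math.pi)

e_s = cmath.rect(0.3, 4.0)   # signal: amplitude 0.3, phase 4.0 rad
e_l = cmath.rect(1.0, 0.5)   # reference: amplitude 1.0, phase 0.5 rad
print(f"recovered relative phase: {recover_relative_phase(e_s, e_l):.4f} rad")
```

Using both the cosine and the sine term is what removes the sign ambiguity that a single interference measurement would leave.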

Supplementary information 6: Multi-layered optical neural network

[00151] In embodiments of the invention, the nonlinearity functions are achieved by first converting optical signals to electrical signals, applying pointwise activations, and converting the electrical signals back to optical signals. The selection of the nonlinear activation functions depends on the measurement methods. Intensity-based activations such as M(z) = ||z|| require intensity detection. Other functions, such as the hyperbolic tangent and rectified linear unit (ReLU) variations, require phase-sensitive detection.

[00152] Although the method proposed in embodiments of the invention is based on optical-electrical conversion, it may be multi-layered by adopting a feedback-clocked strategy. Two circuit arrangements may be used for the feedback-clocked strategy: one is the feedforward way and the other is the recurrent way. Fig. 17A is a schematic diagram illustrating a general structure of a multi-layered neural network. Referring to Fig. 17A, a nonlinear activation function f is applied to each output signal after it has been transformed by a weight matrix W. Finally, a read-out layer is applied to reshape the output signals to the expected shapes, such as by the drop operation. Fig. 17B and Fig. 17C are schematic diagrams illustrating multi-layered neural networks realized in the feedforward way and the recurrent way respectively. In the feedforward way, as shown in Fig. 17B, all layers are tiled on the optical neural chip and one additional column of MZIs is appended to the end of each layer. One of the two output ports of each appended MZI is used for inline monitoring, and the other for guiding the light to the next layer. The phase shifters on the appended MZIs are used for nonlinear activation functions, either intensity-detection-based or phase-detection-based. In the recurrent way, as shown in Fig. 17C, the optical neural chip includes only a single layer. For implementing normal multilayer perceptrons, the single layer is reused each time with different weight matrices. The nonlinear activation is applied to the electrical signals obtained by detecting the output signal, and the input signals are then prepared according to the activated signals. In other words, each time, the input of the next layer is modulated according to the activated output of the previous layer. If the detection method is phase-sensitive, the input modulation is applied to both magnitude and phase.
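The recurrent strategy can be sketched as a loop in which one "optical" matrix-vector product is followed by an electronic activation and re-encoding. The following is a minimal illustration in plain Python; the weight matrices and the input are hypothetical, and the magnitude activation stands in for intensity detection.

```python
def matvec(weight, vec):
    """Complex matrix-vector product, standing in for one pass through the chip."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]

def magnitude_activation(z):
    """Intensity-type activation: only the magnitude survives detection."""
    return abs(z)

def recurrent_forward(weights, inputs, activation=magnitude_activation):
    """Reuse a single layer: load each weight matrix in turn, detect, activate, re-encode."""
    signal = list(inputs)
    for weight in weights:
        detected = matvec(weight, signal)            # optical pass with the current weights
        signal = [activation(z) for z in detected]   # electronic nonlinearity, then new input
    return signal

# Hypothetical weights: a 2x2 unitary followed by a swap of the two ports.
w_first = [[0.6, 0.8j], [0.8j, 0.6]]
w_second = [[0, 1], [1, 0]]
outputs = recurrent_forward([w_first, w_second], [1.0, 0.0])
print(outputs)
```

With a phase-sensitive activation the re-encoded signal would be complex, so the next input would be modulated in both magnitude and phase, as described above.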

[00153] It is explained below how the intensity-based nonlinear activation function works in the complex architecture and how it differs from the real architecture. Assume the simplest setting, in which a single layer consists of two neurons, with the output, the real-valued weight matrix W_R, the complex-valued weight matrix, and the bias, which can be either real-valued or complex-valued [the explicit expressions are missing in the source]. The decision boundary of a binary classification is given by the equation y₁ = y₂. Therefore, for a real-valued network with W_R and intensity-based activation, the decision boundary is as shown in equation (S15).

[equation (S15) missing in the source]

The decision boundary results in two straight lines [expressions missing in the source], whereas under the complex-valued architecture the nonlinearity can be shaped by equation (S16),

[equation (S16) missing in the source]

where k [definition missing in the source]. The decision boundary is a binary quadratic equation in the two variables x₁ and x₂,

[equation missing in the source]

[00154] The intensity-based activation function in real representations approximates a nonlinear shape by doubled line segments, while in complex representations it forms a nonlinear decision surface. Relationships that require approximation by multiple line segments in real networks can be accurately modeled by complex networks with fewer neurons and shallower layers.
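The difference between the two cases can be checked numerically by writing the boundary |y₁|² = |y₂|² as a conic in (x₁, x₂). The sketch below is an independent illustration, not a reconstruction of equations (S15) to (S17): each neuron is given as a hypothetical triple (w₁, w₂, b), and a zero determinant of the 3 × 3 conic matrix indicates a degenerate conic such as a pair of straight lines.

```python
def conic_matrix(neuron):
    """Real symmetric M with |w1*x1 + w2*x2 + b|^2 = v^T M v, where v = (x1, x2, 1)."""
    return [[(p * q.conjugate()).real for q in neuron] for p in neuron]

def boundary_conic(neuron_1, neuron_2):
    """Conic matrix of the decision boundary |y1|^2 - |y2|^2 = 0."""
    m1, m2 = conic_matrix(neuron_1), conic_matrix(neuron_2)
    return [[m1[i][j] - m2[i][j] for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix; zero means the conic is degenerate."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Hypothetical real-valued neurons (w1, w2, b): the boundary factors into two lines.
real_det = det3(boundary_conic((1 + 0j, 2 + 0j, 0.5 + 0j), (-1 + 0j, 1 + 0j, 0.25 + 0j)))

# Hypothetical complex-valued neurons: |x1 + j*x2| = 1, i.e. the unit circle.
complex_det = det3(boundary_conic((1 + 0j, 1j, 0j), (0j, 0j, 1 + 0j)))

print("real-valued determinant:", real_det)
print("complex-valued determinant:", complex_det)
```

In the real case the boundary is a difference of two squares, so it always splits into two lines; the complex example shows that a single pair of complex neurons can already produce a closed curve.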

Supplementary information 7: Logic XOR problem and Generalized XOR problem

[00155] The weight matrices used for the logical XOR task have been described in the first embodiment mentioned above. The complex-valued neuron proposed in embodiments of the invention has been applied to four toy logic tasks, namely the AND, OR, XOR and NAND tasks. The results of the AND and OR tasks are shown in Figs. 18A and 18B respectively. In this embodiment, 10 iterations are conducted and recorded for each logic gate. Quadrants representing logical “0” are marked in grey and those representing logical “1” are marked in white, as shown in Figs. 18A and 18B. When processed by a complex-valued neuron, each of the four possible combinations of logical inputs converges from a random starting point to the correct end point, via a continuous attenuation of magnitude and rotation of phase. Table S2 sets out the truth table of logic gates according to this embodiment.
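How a single complex-valued neuron can separate XOR-type data by quadrant can be seen with a hand-built example. The sketch below uses the quadrant convention of this embodiment (first and third quadrants read as logical “0”, second and fourth as logical “1”) on Gaussian samples labelled by the sign rule of the generalized XOR problem. The weights (1, j) are hand-picked for illustration and are not the trained weights of the embodiment.

```python
import random

def quadrant_label(z):
    """First/third quadrant -> logical 0, second/fourth quadrant -> logical 1."""
    return 0 if z.real * z.imag > 0 else 1

def complex_neuron(x1, x2, w1=1 + 0j, w2=1j):
    """Hypothetical hand-picked weights mapping (x1, x2) to x1 + j*x2."""
    return w1 * x1 + w2 * x2

def xor_target(x1, x2):
    """Generalized XOR: 0 if both components share a sign, otherwise 1."""
    return 0 if x1 * x2 > 0 else 1

rng = random.Random(0)
samples = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
hits = sum(quadrant_label(complex_neuron(a, b)) == xor_target(a, b) for a, b in samples)
accuracy = hits / len(samples)
print(f"accuracy of the hand-built complex neuron: {accuracy:.2f}")
```

A single real-valued neuron with a straight-line boundary cannot reproduce this labelling, which is the contrast the logic gate experiments are meant to show.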

Table S2 Truth table of logic gates

[00156] The loss convergence of the single complex-valued neuron on the logic gate tasks is shown in Figs. 19A-19D. The loss convergence is shown by the real part and the imaginary part of the complex-valued loss respectively. As shown in Figs. 19A-19D, XOR is the only symmetric gate, with two “0” and two “1” outputs, and its four input combinations converge to separate predefined end points (1+1j, 1−1j, −1+1j, −1−1j); thus the loss converges to zero in both the real and imaginary parts. For the other three tasks, convergence has been observed, but the converged positions differ slightly from the predefined end points. As the inputs in the logic gate tasks are four binary combinations, the same complex-valued neuron is also used for generalized XOR problems with multiple input samples. A generalized XOR problem is defined with multiple random 2-dimensional input samples drawn from a Gaussian distribution. If the two components of an input, x₁ and x₂, are of the same sign, the neuron output y is targeted as 0; otherwise, it is 1. In the settings of binary logic gates according to some embodiments of the invention, the mapping from outputs to quadrants is predefined so that data points fall into the first and third quadrants for a logic “0” and the second and fourth quadrants for a logic “1”. The data samples and the decision regions, i.e. the decision surface, formed by a trained complex-valued neuron are shown in Fig. 20A. Those of a single real-valued neuron are displayed in Fig. 20B for comparison. A complex neuron is not restricted to linear patterns as a real neuron is.

Supplementary information 8: Training, convergence, and decision surfaces of the Iris task

[00157] Fig. 21 is a plot showing the training curves, i.e. classification accuracy against training iterations, of the Iris classification task when the proposed complex-valued model is used according to the second embodiment of the invention and when the real-valued model is used. As shown in Fig. 21, the complex-valued model achieves a classification accuracy of 99.3%, while the real-valued model achieves an accuracy of 97.3%. The trained weight vectors used in chip implementation are as follows:

[00158] Specifically, the decision boundaries of the real-valued layer and the complex-valued layer are compared. Fig. 22 shows diagrams of the decision boundaries of a real-valued layer for the Iris classification according to the second embodiment of the invention. Each diagram shows the three species being classified based on two of the four features, thus a total of six combinations are shown in Fig. 22. The accuracy achieved on the entire dataset is 97.3%. With real-valued neurons, all the decision boundaries are straight lines, in conformity with equation (S15). Fig. 23 shows diagrams of the decision boundaries of a complex-valued layer with intensity detection for the Iris classification according to the second embodiment of the invention. In complex layers, nonlinearities are observed in the decision surfaces, as shown in Fig. 23. The accuracy achieved on the entire dataset is 99.3%. With complex-valued neurons, the decision boundaries are nonlinear, in conformity with equations (S16) and (S17). As the real-world Iris dataset is mostly separable by straight lines, the improvement from complex-valued neurons is not significant. Some toy datasets that are inseparable by real-valued neurons, whose decision boundaries are straight lines, are further explored, and it has been found that these tasks can be perfectly solved by complex-valued neurons. The results for two such datasets, the circles and the half-moons, are shown in Figs. 24A and 24B respectively. The confusion matrix of the complex-valued MLP with the real interface is shown in Fig. 25. The x-axis shows the expected label of the testing samples and the y-axis shows the predicted results. If all the predictions conform to the expectations, the confusion matrix has non-zero entries only on the diagonal. From the figure, the error rate of each digit can be determined. For example, for the digit 9, most errors occur in classifying 9 as 5, as there are 120 wrong predictions and the error rate is 19.33%. A set of on-chip configurations of complex-valued neural networks with real encoding and real detection, and with weight matrices trained as unitary, is shown in Table S3.

Table S3 The experimental configurations of classification of MNIST

Supplementary information 10: Differentiation among encoding methods

[00159] Some possible methods for input encoding in the experiment are set out below. The analysis helps to understand the differences between a real-valued neural network and a complex-valued neural network from the perspective of input encoding. Even when optical implementations of real-valued neural networks adopt intensity detection only and the weight matrices are set to be arbitrary real matrices, the phase of the input light should be strictly controlled by maintaining the relative phase between paths at zero. Take a 2-node hidden layer for illustration, and suppose the light waves come from the same source with the same frequency, with the input and the weight matrix as follows:

[expressions missing in the source]

[00160] The time-varying components are ignored as the propagation paths are controlled to be the same. The output from the up port and the corresponding photocurrent are as shown in equation (S19).

[00161] For real-valued implementations, if the input light is not controlled, fluctuations in the phase will make the weight matrices meaningless. That is why, even for a real-valued demonstration, a coherent light source and on-chip light division should be adopted. For complex-valued neural networks, the input can be encoded in the same way as a real one, namely with a fixed phase difference of zero and modulated amplitudes. Alternatively, the output of the input-to-hidden layer may be directly encoded on the chip, in which case the input is encoded with informative initial magnitude and phase.

Supplementary information 11: Network capacity

[00162] The capacity of a network may be defined as the number of real-valued parameters it takes up. A conventional computer represents a complex number x + iy using two real numbers (x, y), which doubles the number of parameters of each layer: [expression missing in the source]. In conventional computers, high capacity means more memory and more computation. However, in a coherent optical implementation, the chip requirements of a real-valued layer are the same as those of a complex-valued layer with the same layer size. Besides, as an n × n complex layer is equivalent to a [size missing in the source] real layer with respect to network capacity, n(n − 1) variables are taken in a complex chip while [expression missing in the source] are taken in a real one. The difference between the numbers of variables grows quadratically with the layer size n. The difference between the occupied optical chip sizes of the equivalent real and complex networks indicates that complex-valued architectures become more practicable as the size of the hidden layer increases.

Supplementary information 12: Pointwise nonlinear activation functions in complex-valued networks

[00163] The choice of a nonlinear activation function is important in the formulation of a neural network. Several activation functions have been proposed for complex-valued neural networks. In optical implementations, these functions are related to the measurement methods. The identity, i.e. no activation function, does not require measurement at all and can be used to identify tasks which are not linearly separable in the real domain but are separable in the complex domain. Magnitude-based activations require intensity detection. Other functions, such as the hyperbolic tangent and rectified linear unit (ReLU) variations, require phase-sensitive detection. The ReLU variations include ModReLU, CReLU, and zReLU. Among these, ModReLU is commonly used and is chosen in the experiments according to some embodiments of the invention.
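For reference, the three ReLU variations named above can be written compactly for a single complex value. The sketch below follows their commonly used definitions from the complex-valued network literature; it is not taken from the experimental code of the embodiments, and the bias value in the example is arbitrary.

```python
def modrelu(z, bias):
    """ModReLU: ReLU(|z| + bias) * z / |z|; thresholds the magnitude, keeps the phase."""
    magnitude = abs(z)
    if magnitude == 0.0:
        return 0j
    return max(magnitude + bias, 0.0) * z / magnitude

def crelu(z):
    """CReLU: ReLU applied separately to the real and imaginary parts."""
    return complex(max(z.real, 0.0), max(z.imag, 0.0))

def zrelu(z):
    """zReLU: pass z unchanged only if its phase lies in [0, pi/2]."""
    return z if (z.real >= 0.0 and z.imag >= 0.0) else 0j

z = 3 + 4j
print(modrelu(z, -1.0), crelu(-1 + 2j), zrelu(-1 + 1j))
```

All three depend on the phase or on the signed real and imaginary parts of the signal, which is why they call for phase-sensitive detection rather than intensity detection alone.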

[00164] In conclusion, as known by a person skilled in the art, complex-valued neural networks offer a number of performance advantages but have been burdened by the heavy computational cost of complex multiplication in a conventional computer. Various embodiments of the invention, for the first time, provide an apparatus and method for implementing a complex-valued neural network. The apparatus may include an ONC as described in various embodiments of the invention. Embodiments of the invention have demonstrated the implementation of truly complex-valued neural networks on a single ONC, where complex multiplication may be realized passively by optical interference. The resulting ONCs have significant performance advantages over real-valued counterparts in a range of tasks at both the single-neuron and the network level. Notably, a single complex-valued neuron can solve certain nonlinear tasks while its real-valued counterpart is unable to do so. Moreover, complex-valued networks on the proposed ONCs demonstrate marked improvements in classification and handwriting recognition tasks. Thus, complex-valued optical neural networks have been shown to feature versatile representations, easy optimization, and rapid learning. The proposed ONC is not only designed for complex-valued neural networks but is also readily applicable to classical real-valued neural network algorithms. Meanwhile, the small chip size, low cost, high computational speed, and low power consumption make it practical to implement large-scale optical deep learning algorithms on the proposed ONC.

[00165] The ONC proposed in various embodiments of the invention also provides a natural pathway towards near-term quantum computation. Notably, the perceptrons here are realized by networks of optical interferometers. Such networks, when coupled with non-classical light, can enable classically intractable sampling tasks. Indeed, there have been a number of recent proposals for generalising neural networks to the quantum domain. The platform proposed by embodiments of the invention, with the incorporation of non-classical light sources, thus provides a promising avenue for their realization.

[00166] It is to be understood that the embodiments and features described above should be considered exemplary and not restrictive; various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. The scope of the invention should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. Furthermore, certain terminology has been used for the purposes of descriptive clarity, and not to limit the disclosed embodiments of the invention.

Claims

1. An apparatus for implementing a complex-valued neural network, the apparatus comprising a plurality of Mach-Zehnder interferometers (MZIs) integrated onto a chip, wherein the plurality of MZIs are arranged to form a signal preparation unit configured to divide an input light from a laser source into a plurality of light beams, and modulate an amplitude and/or a phase of each light beam to generate a plurality of input signals and a reference signal; a weighting unit configured to adjust a configuration of the MZIs which are used to form the weighting unit according to a weight value or a weight matrix to transform the generated input signals into at least one output signal; and a coherent detection unit configured to interfere each of the at least one output signal with the reference signal to obtain information encoded in the magnitude and/or the phase of the at least one output signal.

2. The apparatus according to claim 1, wherein the signal preparation unit is further configured to modulate both the amplitude and the phase of each light beam to generate a plurality of complex-valued input signals.

3. The apparatus according to claim 1, wherein the signal preparation unit is further configured to modulate the amplitude of each light beam and set a phase difference between two adjacent light beams to zero to generate a plurality of real-valued input signals.

4. The apparatus according to any preceding claim, wherein the weighting unit is further configured to adjust the configuration of the MZIs which are used to form the weighting unit based on a feedback signal determined by a processor based on the information obtained by the coherent detection unit from the at least one output signal.

5. The apparatus according to any preceding claim, wherein each MZI comprises a first beam splitter (BS)-phase shifter (PS) pair comprising a first BS and a first tunable PS, and a second BS-PS pair comprising a second BS and a second tunable PS, wherein the first BS is configured to divide a received light beam into a first sub-beam and a second sub-beam with a phase difference of π/2; the first tunable PS is configured to add a first phase change to a path of the first sub-beam; the second BS is configured to combine two received light beams; and the second tunable PS is configured to add a second phase change to a path of the combined light beam from the second BS.

6. The apparatus according to claim 5, wherein each of the first BS and the second BS has a transmissivity of 50:50.

7. The apparatus according to claim 6, wherein each of the first BS and the second BS is a multimode interferometer (MMI).

8. The apparatus according to any one of claim 5 to claim 7, further comprising a plurality of components integrated onto the chip, wherein each of the plurality of components is coupled to the first PS or the second PS of an MZI on the chip to tune a phase of a light beam by the first PS or the second PS coupled thereto.

9. The apparatus according to claim 8, wherein the plurality of components include a plurality of titanium nitride (TiN) heaters.

10. The apparatus according to claim 9, further comprising a plurality of isolation trenches formed between two adjacent TiN heaters on the chip.

11. The apparatus according to claim 8, wherein the plurality of components include a plurality of PIN modulators.

12. The apparatus according to claim 11, wherein the plurality of PIN modulators include a plurality of injection modulators and/or carrier depletion modulators.

13. The apparatus according to any one of claim 5 to claim 12, wherein the weighting unit is further configured to adjust a configuration of the MZIs which are used to form the weighting unit by adjusting the first tunable PS and/or the second tunable PS of each of the MZIs which are used to form the weighting unit.

14. The apparatus according to any preceding claim, further comprising at least one balanced detector integrated onto the chip, wherein each of the at least one balanced detector is configured to detect an intensity of an interfering light signal generated by the coherent detection unit.

15. The apparatus according to any preceding claim, wherein when the apparatus is used to perform a logic gate with a single complex-valued neuron, the signal preparation unit comprises three MZIs configured to divide the input light into four light beams and modulate the magnitude of each light beam to generate three input signals and the reference signal, wherein the three input signals comprise two logic inputs and one bias.

16. The apparatus according to claim 15, wherein the logic gate is an AND, OR, NAND or XOR gate.

17. The apparatus according to any one of claim 1 to claim 14, wherein when the apparatus is used to perform a classification of dataset Iris, the signal preparation unit comprises five MZIs configured to divide the input light into six light beams and modulate the magnitude and/or the phase of each light beam to generate five input signals and the reference signal, wherein the five input signals comprise four inputs and one bias; wherein the weighting unit comprises 12 MZIs which are adjusted according to a 4 by 3 weight matrix to transform the generated input signals into three output signals; and the apparatus comprises three coherent detection units.

18. The apparatus according to claim 17, wherein the apparatus comprises a single layer complex-valued neural network which is trained using any one of the following parameters: three phase parameters {φ1, φ2, φ3}, or three magnitude parameters {ρ1, ρ2, ρ3}, or a combination of phase and magnitude parameters {ρ1, (ρ2, φ1), (ρ2, φ2)}.

19. The apparatus according to any one of claim 1 to claim 14, wherein when the apparatus is used to perform a handwriting recognition task, the apparatus comprises a multi-layer perceptron formed by a plurality of single-layered complex-valued neural networks arranged in a cascading way or a recurrent way.

20. A method for implementing a complex-valued neural network, the method comprising: providing an apparatus for implementing the complex-valued neural network, wherein the apparatus comprises a plurality of MZIs integrated onto a chip, wherein the plurality of MZIs are arranged to form a signal preparation unit, a reference signal generation unit, a weighting unit and a coherent detection unit, wherein the method further comprises: dividing, by the signal preparation unit, an input light from a laser source into a plurality of light beams, and modulating an amplitude and/or a phase of each light beam to generate a plurality of input signals and a reference signal; adjusting, by the weighting unit, a configuration of the MZIs which are used to form the weighting unit according to a weight value or a weight matrix to transform the input signals into at least one output signal; and interfering, by the coherent detection unit, each of the at least one output signal with the reference signal to obtain information encoded in the magnitude and/or the phase of the at least one output signal.

21. The method according to claim 20, wherein the modulating an amplitude and/or a phase of each light beam to generate a plurality of input signals, comprises: modulating both the amplitude and the phase of each light beam to generate a plurality of complex-valued input signals.

22. The method according to claim 20, wherein the modulating an amplitude and/or a phase of each light beam to generate a plurality of input signals, comprises: modulating the amplitude of each light beam, and setting a phase difference between two adjacent light beams to zero to generate a plurality of real-valued input signals.

23. The method according to any one of claim 20 to claim 22, further comprising: adjusting, by the weighting unit, the configuration of the MZIs which are used to form the weighting unit based on a feedback signal determined by a processor based on the information obtained by the coherent detection unit from the at least one output signal.

24. The method according to any one of claim 20 to claim 23, wherein each MZI comprises a first beam splitter (BS)-phase shifter (PS) pair comprising a first BS and a first tunable PS, and a second BS-PS pair comprising a second BS and a second tunable PS, wherein the first BS is configured to divide a received light beam into a first sub-beam and a second sub-beam with a phase difference of π/2; the first tunable PS is configured to add a first phase change to a path of the first sub-beam; the second BS is configured to combine two received light beams; and the second tunable PS is configured to add a second phase change to a path of the combined light beam from the second BS.

25. The method according to claim 24, wherein each of the first BS and the second BS has a transmissivity of 50:50.

26. The method according to claim 25, wherein each of the first BS and the second BS is a multimode interferometer (MMI).

27. The method according to any one of claim 24 to claim 26, wherein the providing the apparatus further comprises: providing the apparatus which further comprises a plurality of components integrated onto the chip, wherein each of the plurality of components is coupled to the first PS or the second PS of an MZI on the chip to tune a phase of a light beam by the first PS or the second PS coupled thereto.

28. The method according to claim 27, wherein the plurality of components include a plurality of TiN heaters.

29. The method according to claim 28, wherein the providing the apparatus further comprises: providing the apparatus which further comprises a plurality of isolation trenches formed between two adjacent TiN heaters on the chip.

30. The method according to claim 27, wherein the plurality of components include a plurality of PIN modulators.

31. The method according to claim 30, wherein the plurality of PIN modulators include a plurality of injection modulators and/or carrier depletion modulators.

32. The method according to any one of claim 24 to claim 31, wherein the adjusting the configuration of the MZIs which are used to form the weighting unit further comprises: adjusting the first tunable PS and/or the second tunable PS of each of the MZIs which are used to form the weighting unit.

33. The method according to any one of claim 20 to claim 32, wherein the providing the apparatus further comprises: providing the apparatus which further comprises at least one balanced detector integrated onto the chip, wherein the method further comprises: detecting, by each of the at least one balanced detector, an intensity of an interfering light signal generated by the coherent detection unit.

34. The method according to any one of claim 20 to claim 33, wherein when the apparatus is used to perform a logic gate with a single complex-valued neuron, the signal preparation unit comprises three MZIs, the dividing the input light from a laser source into a plurality of light beams comprises: dividing the input light into four light beams, and modulating the magnitude of each light beam to generate three input signals and the reference signal, wherein the three input signals comprise two logic inputs and one bias.

35. The method according to claim 34, wherein the logic gate is an AND, OR, NAND or XOR gate.

36. The method according to any one of claim 20 to claim 33, wherein when the apparatus is used to perform a classification of dataset Iris, the signal preparation unit comprises five MZIs, the dividing the input light from a laser source into a plurality of light beams comprises: dividing the input light into six light beams, and modulating the magnitude and/or the phase of each light beam to generate five input signals and the reference signal, wherein the five input signals comprise four inputs and one bias; the adjusting the configuration of the MZIs comprises: adjusting, by the weighting unit, the configuration of 12 MZIs which are used to form the weighting unit according to a 4 by 3 weight matrix to transform the generated five input signals into three output signals; and the interfering one of the at least one output signal with the reference signal comprises: interfering, by each of three coherent detection units, one of the three output signals with the reference signal to obtain information encoded in the magnitude and/or phase of the output signal.

37. The method according to claim 36, wherein the providing the apparatus further comprises: providing the apparatus which comprises a single layer complex-valued neural network trained using any one of the following parameters: three phase parameters {φ1, φ2, φ3}, or three magnitude parameters {ρ1, ρ2, ρ3}, or a combination of phase and magnitude parameters {ρ1, (ρ2, φ1), (ρ2, φ2)}.

38. The method according to any one of claim 20 to claim 33, wherein when the apparatus is used to perform a handwriting recognition task, the providing the apparatus further comprises: providing the apparatus which comprises a multi-layer perceptron formed by a plurality of single-layered complex-valued neural networks arranged in a cascading way or a recurrent way.
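By way of a non-limiting illustration of the MZI structure recited in claims 5 and 24 and of the interference with a reference signal recited in claims 1 and 20, the following minimal numerical sketch may be considered. The beam splitter matrix convention, the ordering of the elements, the two-setting reference phase scheme used to recover both quadratures, and all function names are assumptions made for illustration; they are not stated in the claims and do not limit them.

```python
import numpy as np

# Assumed convention for a 50:50 beam splitter whose two outputs
# differ in phase by pi/2.
BS = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)

def phase_shifter(phase):
    """Tunable phase shifter acting on the first of two paths."""
    return np.diag([np.exp(1j * phase), 1.0])

def mzi(theta, phi):
    """2x2 transfer matrix: first BS, first PS (theta), second BS, second PS (phi).

    Matrices act right to left, so the first BS is the rightmost factor.
    """
    return phase_shifter(phi) @ BS @ phase_shifter(theta) @ BS

def balanced_difference(signal, reference):
    """Interfere signal and reference on a 50:50 BS and subtract the two intensities.

    With the convention above, the result equals 2 * Im(signal * conj(reference)).
    """
    out = BS @ np.array([signal, reference])
    return np.abs(out[0]) ** 2 - np.abs(out[1]) ** 2

def coherent_readout(signal, ref_amplitude=1.0):
    """Recover a complex signal from two balanced measurements (assumed scheme).

    The reference phase is set to 0 and then to -pi/2, giving the imaginary
    and real quadratures of the signal respectively.
    """
    imag_part = balanced_difference(signal, ref_amplitude)
    real_part = balanced_difference(signal, -1j * ref_amplitude)
    return (real_part + 1j * imag_part) / (2 * ref_amplitude)

# Illustrative values only.
T = mzi(theta=0.7, phi=1.3)
recovered = coherent_readout(0.3 - 0.4j)
```

Under these assumptions, the fraction of optical power remaining in the input path is sin²(θ/2), so the first phase shifter sets the splitting ratio of the MZI and the second sets the output phase, while the two balanced measurements return both the magnitude and the phase of the output signal.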