WO2023201970A1 - Computing chip, system, and data processing method - Google Patents
Computing chip, system, and data processing method
- Publication number
- WO2023201970A1 PCT/CN2022/118103 CN2022118103W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- optical
- signal
- matrix
- computing chip
- target
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B10/00—Transmission systems employing electromagnetic waves other than radio-waves, e.g. infrared, visible or ultraviolet light, or employing corpuscular radiation, e.g. quantum communication
- H04B10/50—Transmitters
- H04B10/516—Details of coding or modulation
- H04B10/54—Intensity modulation
- H04B10/541—Digital intensity or amplitude modulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7807—System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
- G06F15/7817—Specially adapted for signal processing, e.g. Harvard architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B10/00—Transmission systems employing electromagnetic waves other than radio-waves, e.g. infrared, visible or ultraviolet light, or employing corpuscular radiation, e.g. quantum communication
- H04B10/50—Transmitters
- H04B10/501—Structural aspects
- H04B10/503—Laser transmitters
- H04B10/505—Laser transmitters using external modulation
- H04B10/5051—Laser transmitters using external modulation using a series, i.e. cascade, combination of modulators
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B10/00—Transmission systems employing electromagnetic waves other than radio-waves, e.g. infrared, visible or ultraviolet light, or employing corpuscular radiation, e.g. quantum communication
- H04B10/50—Transmitters
- H04B10/501—Structural aspects
- H04B10/503—Laser transmitters
- H04B10/505—Laser transmitters using external modulation
- H04B10/5053—Laser transmitters using external modulation using a parallel, i.e. shunt, combination of modulators
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B10/00—Transmission systems employing electromagnetic waves other than radio-waves, e.g. infrared, visible or ultraviolet light, or employing corpuscular radiation, e.g. quantum communication
- H04B10/50—Transmitters
- H04B10/516—Details of coding or modulation
Definitions
- the present application relates to the field of computer technology, and in particular to a computing chip, system and data processing method.
- a computing chip including:
- Signal transmitter used to transmit laser signals
- the electro-optical modulator array is used to convert the laser signal into a target optical signal under the control of the electrical domain controller in the computing chip; the target optical signal is used to represent the input data of the target AI model;
- the photodetector array is used to perform photoelectric conversion on the light calculation results to obtain the model processing results of the target AI model based on the input data.
- the electrical domain controller includes:
- a digital-to-analog conversion module used to convert the first electrical signal into a first analog signal
- the logic control circuit is used to transmit the first analog signal to the electro-optical modulator array, so that the electro-optical modulator array converts the laser signal into the target optical signal according to the first analog signal.
- the electro-optical modulator array is specifically used for:
- the light intensity of the laser signal is modulated according to the first analog signal to obtain the target light signal.
- the digital-to-analog conversion module is also used to convert the second electrical signal into a second analog signal
- the logic control circuit is also used to transmit the second analog signal to the programmable optical structure, so that the programmable optical structure adjusts the phase shifter in itself according to the second analog signal to realize the model weight matrix of the target AI model.
- the electrical domain controller further includes:
- the driving module is used to drive the logic control circuit to transmit the second analog signal to the programmable optical structure, so that the programmable optical structure adjusts the phase shifter in itself according to the second analog signal.
- the electrical domain controller further includes:
- a storage module configured to store model processing results, the first electrical signal, the second electrical signal, and/or the output results of the nonlinear activation function.
- the signal transmitter includes:
- An optical fiber array connected to the laser is used to transmit the laser signal to the electro-optical modulator array in a preset number of input paths.
- it also includes:
- a saturable absorber structure, a bistable structure, or an MZI structure used to implement the nonlinear activation function.
- the programmable optical structure includes: a cascaded MZI structure and a parallel optical attenuator structure.
- a second aspect of this application provides a computing system, including: a plurality of computing chips of any of the above items, each computing chip being connected using optical interconnection technology.
- the third aspect of this application provides a data processing method applied to any of the above computing chips, including:
- the electrical domain controller is used to control the electro-optical modulator array to convert the laser signal into a target optical signal; the target optical signal is used to represent the input data of the target AI model;
- the programmable optical structure is used to calculate the target optical signal and output the optical calculation results; the programmable optical structure implements the model weight matrix of the target AI model;
- the photodetector array is used to perform photoelectric conversion on the light calculation results to obtain the model processing results of the target AI model based on the input data.
- the present application provides a computing chip, which includes: a signal transmitter for transmitting a laser signal; an electro-optical modulator array for converting the laser signal into a target optical signal under the control of the electrical domain controller in the computing chip, where the target optical signal represents the input data of the target AI model; a programmable optical structure that implements the model weight matrix of the target AI model and is used to calculate the target optical signal and output the optical calculation results; and a photodetector array for performing photoelectric conversion on the optical calculation results to obtain the model processing results of the target AI model based on the input data.
- Figure 1 is a schematic structural diagram of a computing chip provided in one or more embodiments of the present application.
- Figure 2 is a schematic diagram of a programmable optical structure provided in one or more embodiments of the present application.
- Figure 3 is a schematic structural diagram of a single MZI provided in one or more embodiments of the present application.
- Figure 4 is a schematic diagram of the topology corresponding to the cascaded MZI structure in one or more embodiments of Figure 2;
- Figure 5 is a schematic structural diagram of another computing chip provided in one or more embodiments of the present application.
- Figure 6 is a schematic structural diagram of a single optical attenuator provided in one or more embodiments of the present application.
- Figure 7 is a schematic diagram of a weight matrix between two layers of networks provided in one or more embodiments of the present application.
- Figure 8 is a flow chart of a data processing method provided in one or more embodiments of the present application.
- logic circuits such as FPGA can be used to accelerate the processing of specific operations such as convolution.
- electronic chips such as FPGAs and GPUs are constrained by Moore's Law, so their computing power cannot continue to grow. Their computing power is therefore limited, and they are prone to problems such as crosstalk, high power consumption, high latency, and thermal deposition.
- this application provides a computing chip, system and data processing method to use optical structures to process complex operations, thereby improving the hardware's processing capabilities for complex operations.
- an embodiment of the present application discloses a computing chip, which includes: a signal transmitter, an electro-optical modulator array, a programmable optical structure, a photodetector array, and an electrical domain controller.
- the programmable optical structure implements the model weight matrix of the target AI model.
- the model weight matrix of the target AI model is obtained by training with software algorithms. That is, after an AI model is trained with a software algorithm, the corresponding model weight matrix can be determined from the parameters of the AI model, and the corresponding programmable optical structure can then be built according to the model weight matrix, so that the programmable optical structure has the same function as the AI model implemented by the algorithm. The programmable optical structure can subsequently replace the software AI model for calculation, thereby increasing the model calculation speed. For example, if the AI model is an image classification model, then the programmable optical structure that implements the weight matrix of the model can also perform image classification and finally output the image category. Correspondingly, the target optical signal input to the programmable optical structure represents the image data to be classified.
- the signal transmitter is used to transmit laser signals.
- the electro-optical modulator array is used to convert the laser signal into a target optical signal under the control of the electrical domain controller in the computing chip; the target optical signal is used to represent the input data of the target AI model.
- the programmable optical structure is used to calculate the target optical signal and output the optical calculation results.
- the photodetector array is used to perform photoelectric conversion on the light calculation results to obtain the model processing results of the target AI model based on the input data.
- the programmable optical structure includes: a cascaded MZI structure and a parallel optical attenuator structure.
- the programmable optical structure can be shown in Figure 2.
- Figure 2 shows a 6-input, 6-output programmable optical structure, in which the phase shifters are marked.
- the first half of Figure 2 shows the cascaded MZI structure, and the second half shows the parallel optical attenuator structure.
- an MZI structure is shown in Figure 3.
- an MZI structure includes two directional couplers, B1 and B2, an internal phase shifter Rθ, and an external phase shifter.
- the directional coupler is a 4-port device with 2 inputs and 2 outputs. It couples the optical power of an input port to the output ports in a 50:50 split ratio.
- the internal phase shifter, with phase 2θ (0 ≤ θ ≤ π/2), is responsible for modulating the MZI output power.
- the external phase shifter is responsible for compensating the relative phase of the two light beams output by the MZI. Both phase shifters are programmable.
- an MZI structure corresponds to a 2 × 2 unitary matrix.
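The correspondence between one MZI and a 2 × 2 unitary matrix can be checked numerically. The sketch below is illustrative only: the coupler matrix, the placement of the internal shifter on the upper arm, and the placement of the external shifter on the upper output are assumed conventions, and the names `mzi_matrix`, `theta` and `phi` are hypothetical, not taken from the application.

```python
import numpy as np

def mzi_matrix(theta, phi):
    """2x2 transfer matrix of one MZI under the assumed conventions."""
    coupler = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)   # 50:50 directional coupler
    internal = np.diag([np.exp(2j * theta), 1])            # internal shifter: phase 2*theta on the upper arm
    external = np.diag([np.exp(1j * phi), 1])              # external shifter on the upper output
    # Light passes coupler B1, the internal shifter, coupler B2, then the external shifter.
    return external @ coupler @ internal @ coupler

T = mzi_matrix(theta=0.3, phi=1.1)
is_unitary = np.allclose(T.conj().T @ T, np.eye(2))

# Send light into the upper input port only and look at the output powers.
powers = np.abs(T @ np.array([1.0, 0.0])) ** 2
```

Under these assumptions the two output powers are sin²θ and cos²θ, so the internal shifter alone sets the power split while the external shifter only changes a relative phase.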
- the cascaded MZI structure in the first half of Figure 2 corresponds to a 6-dimensional unitary matrix.
- the specific topology corresponding to the 6-dimensional unitary matrix is shown in Figure 4.
- in Figure 4, each node represents an MZI structure whose input and output terminals are connected in reverse, and 1 to 6 are the six input signals.
- the label "xy" at each node identifies the two input signals of the MZI structure at that position.
- for example, "65" means that the two input signals of the MZI structure at that position are 6 and 5.
- the computing chip in this embodiment uses a programmable optical structure to process the target optical signal representing the input data of the AI model, and can quickly obtain the model processing results of the AI model for the input data, thus improving the hardware's processing capabilities for complex operations.
- the electrical domain controller in the computing chip controls the electro-optical modulator array to convert the ordinary laser signal emitted by the signal transmitter into the target optical signal representing the input data of the AI model. The programmable optical structure then calculates the target optical signal to obtain the corresponding optical calculation result.
- the optical calculation results are photoelectrically converted by a photodetector array to obtain the corresponding electrical signal representation.
- this computing chip has the characteristics of low power consumption, high throughput and low latency. It should be noted that the photon computing chip is a non-von Neumann architecture, can perform calculations at the speed of light, and has higher computing power than electronic AI chips.
- the composition structure of the electrical domain controller can be referred to FIG. 5 .
- the electrical domain controller includes: digital-to-analog conversion module, logic control circuit, storage module SRAM (Static Random-Access Memory), and driver module.
- the digital-to-analog conversion module is used to convert the first electrical signal into a first analog signal; the logic control circuit is used to transmit the first analog signal to the electro-optical modulator array, so that the electro-optical modulator array converts the laser signal into the target optical signal according to the first analog signal.
- the first electrical signal is specifically: an instruction capable of converting the laser signal into a target optical signal.
- the electro-optical modulator array is specifically used to modulate the light intensity of the laser signal according to the first analog signal to obtain the target optical signal. It can be seen that the electro-optical modulator array can modulate the light intensity of the laser signal.
- the digital-to-analog conversion module is also used to convert the second electrical signal into a second analog signal; the logic control circuit is also used to transmit the second analog signal to the programmable optical structure, so that the programmable optical structure adjusts its own phase shifters according to the second analog signal to implement the model weight matrix of the target AI model.
- the second electrical signal is specifically: the model weight matrix of the target AI model.
- the driving module in the electrical domain controller is used to drive the logic control circuit to transmit the second analog signal to the programmable optical structure, so that the programmable optical structure adjusts the phase shifter in itself according to the second analog signal. It can be seen that the driving module can control each phase shifter in the programmable optical structure.
- the refractive index of the material can be changed by modulating the voltage to achieve changes in the phase value of the phase shifter.
- the voltage can also be adjusted to change the physical distance between the interference arms, thereby changing the phase value of the phase shifter.
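For the refractive-index mechanism, the standard relation between an index change Δn over a shifter of length L and the resulting phase change at wavelength λ is Δφ = 2π·Δn·L/λ. The sketch below applies it with purely illustrative numbers; the wavelength and shifter length are assumptions, not values from the application, and the function names are hypothetical.

```python
import math

def phase_shift(delta_n, length_m, wavelength_m):
    """Phase change in radians for an index change delta_n over the given length."""
    return 2.0 * math.pi * delta_n * length_m / wavelength_m

def index_change_for(phase_rad, length_m, wavelength_m):
    """Index change needed to reach a target phase change."""
    return phase_rad * wavelength_m / (2.0 * math.pi * length_m)

WAVELENGTH = 1550e-9   # assumed wavelength in metres, not from the source
LENGTH = 100e-6        # assumed phase shifter length in metres, not from the source

# Index change that would give a phase change of pi under these assumptions.
dn_pi = index_change_for(math.pi, LENGTH, WAVELENGTH)
```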
- the model processing result of the target AI model output by the photodetector array against the input data may not be the final result.
- because the programmable optical structure may implement the weight matrix of only one layer of the model, the model processing results of the target AI model for the input data may not be obtainable in a single pass; in that case the programmable optical structure can be used for repeated calculations. In other words, the programmable optical structure may implement the weight matrix of only one layer of the AI model. There is a corresponding weight matrix between each pair of adjacent layers of the AI model, so the target AI model may have multiple model weight matrices. If the programmable optical structure implements all the weight matrices of the model, then the model processing results can be calculated in a single pass.
- the processing result of the first output can be temporarily stored in the storage module in the electrical domain controller for subsequent access and recalculation.
- the storage module in the electrical domain controller is used to store the model processing result, the first electrical signal, the second electrical signal and/or the output result of the nonlinear activation function.
- the programmable optical structure calculates linear operations, so the nonlinear activation function involved in the target AI model can be calculated using software, and then the electrical domain controller is used to obtain the results calculated by the software.
- the composition structure of the signal transmitter can be referred to FIG. 5 .
- the signal transmitter includes: a laser and an optical fiber array connected to the laser.
- a laser is used to generate a laser signal.
- the fiber optic array is used to transmit the laser signal to the electro-optical modulator array in a preset number of input paths. That is to say, there are a corresponding number of input paths in the optical fiber array, which can divide the laser signal generated by the laser into several optical signals and transmit them to the electro-optical modulator array.
- one input path corresponds to one electro-optical modulator, so multiple optical input paths correspond to one electro-optical modulator array.
- each electro-optical modulator can adjust the intensity of the optical signal of its corresponding path.
- the photodetector array includes a plurality of photodetectors, and one photodetector is used to convert the optical signal of its corresponding path.
- an embodiment of the present application discloses another design architecture of a computing chip.
- the processor core of the computing chip is divided into two parts: the electrical domain and the optical domain.
- the electrical domain is a CMOS microelectronic chip, including logic control module, storage module, digital-to-analog conversion module and driver module.
- the digital-to-analog conversion module is used for D/A conversion or A/D conversion.
- the optical domain is a silicon optical chip that integrates optical waveguides and optical modulators.
- the electrical domain part and the optical domain part are packaged using a flip-chip process and are connected correspondingly through a bump array.
- the programmable optical matrix is a cascaded optical modulation array (ie, cascaded MZI structure), which can perform linear multiplication of a two-dimensional weight matrix and a one-dimensional input vector.
- the electronic chip converts the optimized model weight matrix (i.e., the second electrical signal) into a voltage signal (i.e., the second analog signal) through D/A conversion, and uses the voltage signal to drive the optical modulation array to modulate the laser signal in the waveguide, that is, to adjust the phase shifters in the cascaded MZI structure, so that the optical modulation array realizes the model weight matrix W.
- the model weight matrix W is obtained by training the model using computer software.
- the cross-entropy loss function can be used to calculate the deviation between the network output and the actual value (label); the error backpropagation algorithm is then used to iteratively reduce this deviation, and the weight matrices of the network are obtained through training.
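As a rough illustration of this training step, the sketch below fits a single 6 × 6 linear layer with a softmax output by gradient descent on the cross-entropy loss. It is a deliberately reduced stand-in for training a full network: the feature vector, label, learning rate and iteration count are all made-up values, and a real model would have several layers.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())          # subtract the max for numerical stability
    return e / e.sum()

def cross_entropy(W, x, label):
    """Deviation between the network output and the true label."""
    return -np.log(softmax(W @ x)[label])

def gradient(W, x, label):
    p = softmax(W @ x)
    p[label] -= 1.0                  # d(loss)/d(logits) = softmax - one_hot
    return np.outer(p, x)            # chain rule back to the weights

x = np.array([0.9, 0.1, 0.4, 0.7, 0.2, 0.5])   # illustrative feature vector
label = 2                                       # illustrative class index
W = np.zeros((6, 6))

loss_before = cross_entropy(W, x, label)
for _ in range(50):
    W -= 0.1 * gradient(W, x, label)            # one gradient-descent update
loss_after = cross_entropy(W, x, label)
```

The resulting `W` plays the role of the trained weight matrix that would then be loaded onto the optical modulation array.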
- the above-trained weight matrix is loaded onto the optical modulation array of the computing chip in the form of an analog signal, so that the optical modulation array can be used for model inference applications.
- the integrated silicon optical chip is connected to the only peripheral laser light source through a coupling optical fiber array.
- the laser outputs a continuous laser signal to the chip.
- the microelectronic chip converts the preset input data (i.e., the first electrical signal) into a voltage signal (i.e., the first analog signal) through the D/A conversion module, and uses the voltage signal to drive the electro-optical modulator array to attenuate the incident light intensity, thereby encoding the multi-channel incident optical signal as a one-dimensional input column vector x (used to represent the model input data). This step moves from the digital domain to the analog domain.
- the optical signal via the electro-optical modulator array is input into the programmable optical matrix, and the programmable optical matrix calculates and outputs a one-dimensional vector result.
- the result is received by the photodetector array and converted into a multi-channel current signal.
- the current signal is converted into a voltage signal through a transimpedance amplifier, converted into a digital signal through A/D and saved to the microelectronic chip. This step returns from the analog domain to the digital domain.
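The digital-to-analog-to-digital round trip described above can be sketched as follows. This is a simplified model under stated assumptions: the converter resolution, laser power, responsivity and transimpedance gain are invented values, the converters are ideal uniform quantizers, and the programmable optical matrix is left out (treated as the identity) so that only the signal chain is shown.

```python
import numpy as np

BITS = 8                # assumed converter resolution
LASER_POWER = 1.0e-3    # assumed optical power per input path, in watts
RESPONSIVITY = 0.8      # assumed photodetector responsivity, in A/W
TIA_GAIN = 1.0e4        # assumed transimpedance gain, in V/A

def quantize(values, bits=BITS):
    """Uniform quantizer on [0, 1], standing in for the D/A and A/D converters."""
    levels = 2 ** bits - 1
    return np.round(np.clip(values, 0.0, 1.0) * levels) / levels

x = np.array([0.00, 0.25, 0.50, 0.75, 1.00, 0.33])   # digital input data

drive = quantize(x)                        # D/A: digital domain to analog drive level
intensity = LASER_POWER * drive            # modulators attenuate each path
current = RESPONSIVITY * intensity         # photodetectors: optical power to current
voltage = TIA_GAIN * current               # transimpedance amplifier: current to voltage
full_scale = TIA_GAIN * RESPONSIVITY * LASER_POWER
readout = quantize(voltage / full_scale)   # A/D: analog domain back to digital
```

In this idealized chain the only error in `readout` is the quantization error of the converters.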
- a represents the output of the programmable optical matrix.
- the nonlinear activation function can be implemented using materials and structures that meet the activation function conditions, such as saturable absorbers, bistable structures, and the Kerr effect in an MZI.
- the nonlinear activation function can be implemented in the electrical domain using software algorithms.
- an optical structure can also be used to implement the nonlinear activation function, such as a saturable absorber structure, a bistable structure, or an MZI structure.
- the computing chip shown in Figure 5 can implement matrix-vector linear multiplication operations. Specifically, the computing chip encodes data by modulating the amplitude or phase of the laser pulse, and the data is a continuous real number.
- the computing chip can also use the traditional fully connected neural network architecture.
- the network principle also includes three parts: an input layer, several hidden layers and an output layer.
- each layer of the network contains several neuron nodes, and the neuron nodes in each layer are connected through weight matrices to perform linear matrix multiplication operations.
- neuron node values are passed through a nonlinear activation function before being input to the next layer.
- the computing chip provided in this embodiment can use optical structures to implement linear and nonlinear computing functions such as neuron node calculation, weighting, and activation in the software sense.
- the programmable optical matrix is a key component of the computing chip.
- the following introduces the implementation principle of programmable optical matrix.
- U(m) is an m × m unitary matrix
- Σ is an m × n diagonal matrix with non-negative real numbers on the diagonal
- V^T(n) is an n × n unitary matrix, which is the conjugate transpose of V(n).
- the model weight matrix can be decomposed into the product of two unitary matrices and a diagonal matrix. Optical devices are then used to implement the two unitary matrices and the diagonal matrix respectively, which yields a programmable optical structure that implements the model weight matrix. If the model weight matrix is decomposed into the product of U, Σ and V^T, then the input optical signal passes in sequence through the optical structures corresponding to V^T(n), Σ, and U(m), and the corresponding model processing results are obtained.
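The decomposition and the order in which the signal meets the three stages can be reproduced with NumPy. In this sketch the weight matrix and input vector are random placeholders, and the rescaling by the largest singular value is an added assumption (a passive attenuator cannot amplify), not a step stated in the application.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 6))      # stand-in for a trained weight matrix
x = rng.uniform(size=6)          # stand-in for an encoded input vector

U, s, Vh = np.linalg.svd(W)      # W = U @ diag(s) @ Vh, with Vh in the role of V^T

# The signal passes the three stages in order: V^T first, then Sigma, then U.
after_vt = Vh @ x
after_sigma = s * after_vt       # diagonal matrix: one factor per channel
y = U @ after_sigma

# Assumed normalization: scale the singular values into [0, 1] for passive
# attenuators and restore the overall factor afterwards in the electrical domain.
scale = s.max()
y_scaled = scale * (U @ ((s / scale) * after_vt))
```

Both `y` and `y_scaled` agree with the direct product `W @ x`, which is the point of splitting the weight matrix into three optically realizable factors.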
- any m-dimensional unitary matrix can be implemented by cascading individual MZI structures.
- in Tqp, p and q are the optical matrix input port numbers of the two input ports entering the MZI structure (refer to Figure 4), where p and q are port numbers between 1 and m with p < q.
- T61 also represents the expansion matrix of this MZI structure.
- Tqp is obtained from the identity matrix by replacing the element in row p, column p with u11, the element in row p, column q with u12, the element in row q, column p with u21, and the element in row q, column q with u22.
- the remaining diagonal elements are all 1, and the remaining off-diagonal elements are all 0. That is, when the Tqp matrix participates in the operation, only the signals entering the ports of the corresponding MZI structure are changed; for the other signals, which are not involved, the corresponding part of Tqp is a diagonal matrix.
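A small sketch of how such an expansion matrix can be built and what it does to a signal vector. The helper name `embed_mzi` and the rotation used as the 2 × 2 block are illustrative choices, not taken from the application.

```python
import numpy as np

def embed_mzi(m, p, q, u):
    """m x m matrix Tqp: the identity except for the 2x2 block u on ports p and q (1-based)."""
    T = np.eye(m, dtype=complex)
    T[p - 1, p - 1], T[p - 1, q - 1] = u[0, 0], u[0, 1]
    T[q - 1, p - 1], T[q - 1, q - 1] = u[1, 0], u[1, 1]
    return T

angle = 0.4
u = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])   # illustrative 2x2 unitary block
T65 = embed_mzi(6, p=5, q=6, u=u)

x = np.arange(1.0, 7.0)    # signals on ports 1 to 6
y = T65 @ x                # only the signals on ports 5 and 6 are mixed
```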
- the corresponding Tqp can be obtained by controlling the value of the phase shifter in each MZI structure.
- the dimensionality of the m-dimensional unitary matrix can be reduced to m-1 by right multiplying Tm(m-1), Tm(m-2),..., Tm2, Tm1, which satisfies:
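The reduction step can be checked numerically. The sketch below right-multiplies a random 6-dimensional unitary matrix by five matrices of the Tqp form, each chosen to cancel one off-diagonal element of the last row, and leaves a 5-dimensional unitary block plus a lone 1. The 2 × 2 blocks are generic unitaries computed directly from the matrix entries; mapping them to actual phase shifter settings is outside this sketch, and all function names are hypothetical.

```python
import numpy as np

def nulling_block(a, b):
    """2x2 unitary u with a*u11 + b*u21 = 0, so the entry holding a becomes zero."""
    r = np.hypot(abs(a), abs(b))
    if r == 0:
        return np.eye(2, dtype=complex)
    return np.array([[b, np.conj(a)], [-a, np.conj(b)]]) / r

def reduce_last_row(U):
    """Right-multiply U by T_m(m-1), ..., T_m1 to clear the last row and column."""
    m = U.shape[0]
    out = U.astype(complex)
    for p in range(m - 2, -1, -1):               # 0-based columns m-2 down to 0
        u = nulling_block(out[m - 1, p], out[m - 1, m - 1])
        T = np.eye(m, dtype=complex)
        T[p, p], T[p, m - 1] = u[0, 0], u[0, 1]
        T[m - 1, p], T[m - 1, m - 1] = u[1, 0], u[1, 1]
        out = out @ T                            # mixes columns p and m-1 only
    return out

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6)))  # random unitary
R = reduce_last_row(Q)
block = R[:-1, :-1]                              # remaining 5-dimensional unitary
```

Repeating the same procedure on `block` would reduce the dimension again, which is how a cascade of MZI structures can account for the whole unitary matrix.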
- the diagonal matrix Σ only requires control of each diagonal element, so it can be implemented using optical attenuators based on the MZI structure.
- m parallel-connected optical attenuators can realize programming of an m-dimensional diagonal matrix.
- the structure of a single optical attenuator is shown in Figure 6. The input and output of the lower path are blocked. When the input light intensity is E, the output light intensity is attenuated to E cos²θ.
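Given the cos²θ relation, the phase setting for a desired attenuation follows by inverting it. The sketch below computes θ for a few target transmissions and applies them to a bank of parallel attenuators; the helper names and the example values are illustrative, not part of the application.

```python
import math

def attenuator_angle(transmission):
    """Phase parameter theta in [0, pi/2] that gives intensity transmission cos^2(theta)."""
    if not 0.0 <= transmission <= 1.0:
        raise ValueError("a passive attenuator can only realize transmissions in [0, 1]")
    return math.acos(math.sqrt(transmission))

def attenuate(intensity, theta):
    """Output intensity of one attenuator for the given input intensity."""
    return intensity * math.cos(theta) ** 2

targets = [1.0, 0.5, 0.25, 0.0]                    # desired diagonal entries (example values)
angles = [attenuator_angle(t) for t in targets]    # one setting per parallel attenuator
outputs = [attenuate(2.0, th) for th in angles]    # each path fed with intensity 2.0
```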
- a 6-input and 6-output programmable optical structure as shown in Figure 2 can be realized. If the simplest 2-layer fully connected neural network is constructed to realize the classification and recognition function, then the entire network includes an input layer, a hidden layer and an output layer. The input layer is a feature vector extracted from the target to be classified, containing 6 elements. The two weight matrices between each layer are shown in Figure 7. Figure 7 only illustrates the two weight matrices and does not draw each layer.
- each weight matrix W is decomposed into two 6-dimensional unitary matrices and a 6-dimensional diagonal matrix Σ through singular value decomposition.
- Unitary matrices and diagonal matrices are operated by photon computing chips.
- the topological structure of the 6-dimensional unitary matrix is shown in Figure 4.
- the input laser signal starts to propagate from T65^T, where T65^T is the transpose of T65, which is equivalent to connecting the MZI in reverse.
- the optical signal first passes through the external phase shifter.
- V1^T(6), U1(6), V2^T(6) and U2(6) can be realized in sequence.
- the nonlinear activation function f is calculated in software, where f can be ReLU, the Sigmoid function, etc. A classification and recognition network ultimately needs to produce its output through the normalized exponential function softmax.
- Figure 2 contains a calculation structure corresponding to a 6-dimensional unitary matrix (i.e., the cascaded MZI structure shown in Figure 2) and a calculation structure corresponding to a 6-dimensional diagonal matrix (i.e., the parallel optical attenuator structure shown in Figure 2).
- two photon computing chips are required.
- first calculation: program the phase shifters so that the unitary matrix structure of the photon computing chip realizes V1^T(6) and the diagonal matrix structure realizes Σ1.
- the electro-optical modulator array encodes the input feature vector x onto the laser signal to start the calculation.
- the photodetector array is used to collect the first calculation result and temporarily store it in the memory.
- the data enters the hidden layer from the input layer.
- the 2-layer fully connected neural network needs to complete the above operation again, perform weight matrix-vector linear multiplication operation W(2)z(1), and finally pass the photodetector and identify the channel with the maximum output power, that is, identify the object category. If quantitative analysis is required, the softmax function can be calculated in the software. At this time, the data enters the output layer from the hidden layer.
- the programmable optical structure in the photonic computing chip is suitable for operations between matrices and vectors.
- optical interconnection technology can also be used to operate multiple vectors in parallel in different photonic computing chips. Using the photon computing chip provided in this embodiment can improve the running speed of matrix multiplication operations in neural networks, and has the characteristics of low power consumption, high throughput, and low latency.
- This embodiment provides a computing system, including: a plurality of computing chips according to any of the above embodiments, and each computing chip is connected using optical interconnection technology.
- This embodiment provides a computing system that can use an optical structure to process complex operations, thereby improving the hardware's processing capabilities for complex operations.
- a data processing method provided by an embodiment of the present application is introduced below.
- the data processing method described below and the computing chip described above can be referred to each other.
- This embodiment provides a data processing method, which is applied to the computing chip of any of the above embodiments, including:
- the data processing method provided by the computing chip of this embodiment can use optical structures to process complex operations, thereby improving the hardware's processing capabilities for complex operations.
- RAM random access memory
- ROM read-only memory
- electrically programmable ROM, electrically erasable programmable ROM
- registers, hard disks, removable disks, CD-ROMs, or any other form of readable storage medium known in the technical field.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- Electromagnetism (AREA)
- Computer Networks & Wireless Communication (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Optics & Photonics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Microelectronics & Electronic Packaging (AREA)
- Optical Modulation, Optical Deflection, Nonlinear Optics, Optical Demodulation, Optical Logic Elements (AREA)
Abstract
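The present application discloses a computing chip, a system, and a data processing method in the field of computer technology. An electrical-domain controller in the computing chip can control an electro-optic modulator array to convert a laser signal emitted by a signal transmitter into a target optical signal that represents the input data of an AI model. A programmable optical structure then performs fast computation on the target optical signal to obtain the corresponding optical computation result, and a photodetector array performs photoelectric conversion on that result to obtain its electrical-signal representation, which is the model processing result of the AI model for the input data.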
Description
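Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 202210417937.3, entitled "A Computing Chip, System, and Data Processing Method" (一种计算芯片、系统及数据处理方法), filed with the China Patent Office on April 21, 2022, the entire contents of which are incorporated herein by reference.
The present application relates to the field of computer technology, and in particular to a computing chip, a system, and a data processing method.
At present, logic circuits such as FPGAs and GPUs can be used to accelerate specific operations such as convolution. However, the inventors realized that electronic chips such as FPGAs and GPUs are constrained by Moore's Law, so their computing power cannot keep growing. Their computing capability is therefore limited, and they are prone to crosstalk, high power consumption, high latency, and heat accumulation.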
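Summary of the Invention
In one aspect, the present application provides a computing chip, comprising:
a signal transmitter, configured to emit a laser signal;
an electro-optic modulator array, configured to convert the laser signal into a target optical signal under the control of an electrical-domain controller in the computing chip, the target optical signal representing the input data of a target AI model;
a programmable optical structure that implements the model weight matrix of the target AI model, configured to perform computation on the target optical signal and output an optical computation result; and
a photodetector array, configured to perform photoelectric conversion on the optical computation result to obtain the model processing result of the target AI model for the input data.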
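In one embodiment, the electrical-domain controller comprises:
a digital-to-analog conversion module, configured to convert a first electrical signal into a first analog signal; and
a logic control circuit, configured to transmit the first analog signal to the electro-optic modulator array, so that the electro-optic modulator array converts the laser signal into the target optical signal according to the first analog signal.
In one embodiment, the electro-optic modulator array is specifically configured to:
modulate the light intensity of the laser signal according to the first analog signal to obtain the target optical signal.
In one embodiment, the digital-to-analog conversion module is further configured to convert a second electrical signal into a second analog signal; and
the logic control circuit is further configured to transmit the second analog signal to the programmable optical structure, so that the programmable optical structure adjusts its own phase shifters according to the second analog signal and thereby implements the model weight matrix of the target AI model.
In one embodiment, the electrical-domain controller further comprises:
a driver module, configured to drive the logic control circuit to transmit the second analog signal to the programmable optical structure, so that the programmable optical structure adjusts its own phase shifters according to the second analog signal.
In one embodiment, the electrical-domain controller further comprises:
a storage module, configured to store the model processing result, the first electrical signal, the second electrical signal, and/or the output of a nonlinear activation function.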
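In one embodiment, the signal transmitter comprises:
a laser, configured to generate the laser signal; and
an optical fiber array connected to the laser, configured to transmit the laser signal to the electro-optic modulator array over a preset number of input paths.
In one embodiment, the computing chip further comprises:
a saturable absorber structure, a bistable structure, or an MZI structure for implementing a nonlinear activation function.
In one embodiment, the programmable optical structure comprises a cascaded MZI structure and a parallel optical attenuator structure.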
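In a second aspect, the present application provides a computing system, comprising a plurality of computing chips according to any of the above, the computing chips being connected using optical interconnection technology.
In a third aspect, the present application provides a data processing method, applied to the computing chip according to any of the above, comprising:
emitting a laser signal with the signal transmitter;
controlling, with the electrical-domain controller, the electro-optic modulator array to convert the laser signal into a target optical signal, the target optical signal representing the input data of a target AI model;
performing computation on the target optical signal with the programmable optical structure and outputting an optical computation result, the programmable optical structure implementing the model weight matrix of the target AI model; and
performing photoelectric conversion on the optical computation result with the photodetector array to obtain the model processing result of the target AI model for the input data.
As the above solutions show, the present application provides a computing chip comprising: a signal transmitter, configured to emit a laser signal; an electro-optic modulator array, configured to convert the laser signal into a target optical signal under the control of an electrical-domain controller in the computing chip, the target optical signal representing the input data of a target AI model; a programmable optical structure that implements the model weight matrix of the target AI model, configured to perform computation on the target optical signal and output an optical computation result; and a photodetector array, configured to perform photoelectric conversion on the optical computation result to obtain the model processing result of the target AI model for the input data.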
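To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a schematic structural diagram of a computing chip provided in one or more embodiments of the present application;
Figure 2 is a schematic diagram of a programmable optical structure provided in one or more embodiments of the present application;
Figure 3 is a schematic structural diagram of a single MZI provided in one or more embodiments of the present application;
Figure 4 is a schematic diagram of the topology corresponding to the cascaded MZI structure of Figure 2 in one or more embodiments;
Figure 5 is a schematic structural diagram of another computing chip provided in one or more embodiments of the present application;
Figure 6 is a schematic structural diagram of a single optical attenuator provided in one or more embodiments of the present application;
Figure 7 is a schematic diagram of the weight matrices between the layers of a two-layer network provided in one or more embodiments of the present application;
Figure 8 is a flowchart of a data processing method provided in one or more embodiments of the present application.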
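The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present application.
At present, logic circuits such as FPGAs can be used to accelerate specific operations such as convolution. However, electronic chips such as FPGAs and GPUs are constrained by Moore's Law, so their computing power cannot keep growing. Their computing capability is therefore limited, and they are prone to crosstalk, high power consumption, high latency, and heat accumulation.
For this reason, the present application provides a computing chip, a system, and a data processing method that use an optical structure to process complex operations, thereby improving the hardware's capability to process complex operations.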
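Referring to Figure 1, an embodiment of the present application discloses a computing chip comprising a signal transmitter, an electro-optic modulator array, a programmable optical structure, a photodetector array, and an electrical-domain controller. The programmable optical structure implements the model weight matrix of the target AI model.
The model weight matrix of the target AI model is obtained by training with a software algorithm. That is, once an AI model has been trained in software, the corresponding model weight matrix can be determined from the model's parameters, and a programmable optical structure is then built according to that weight matrix so that it has the same function as the software implementation of the AI model. The programmable optical structure can subsequently replace the software AI model for computation, which speeds up model computation and improves processing efficiency. For example, if the AI model is an image classification model, the programmable optical structure that implements its weight matrix can also classify images and output the image category. Accordingly, the target optical signal fed into the programmable optical structure represents the image data to be classified.
Specifically, the signal transmitter emits a laser signal. The electro-optic modulator array converts the laser signal into a target optical signal under the control of the electrical-domain controller in the computing chip; the target optical signal represents the input data of the target AI model. The programmable optical structure performs computation on the target optical signal and outputs an optical computation result. The photodetector array performs photoelectric conversion on the optical computation result to obtain the model processing result of the target AI model for the input data.
In one specific implementation, the programmable optical structure comprises a cascaded MZI structure and a parallel optical attenuator structure. The programmable optical structure may be as shown in Figure 2, which depicts a 6-input, 6-output programmable optical structure in which each "□" denotes a phase shifter. The first half of Figure 2 is the cascaded MZI structure and the second half is the parallel optical attenuator structure.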
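A single MZI structure is shown in Figure 3. It comprises two directional couplers, B1 and B2, an internal phase shifter Rθ, and an external phase shifter.
Each directional coupler is a 4-port device with 2 inputs and 2 outputs that couples the optical power at its input ports to its output ports with a 50:50 splitting ratio. The internal phase shifter 2θ (0 ≤ θ ≤ π/2) modulates the MZI output power. The external phase shifter compensates the relative phase of the two beams output by the MZI. Both phase shifters are therefore programmable. One MZI structure corresponds to one 2×2 unitary matrix.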
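The composition of a single MZI from its four elements can be sketched numerically. The following Python sketch is an illustration only: the symmetric 50:50 coupler matrix, the placement of both phase shifters on the upper waveguide, the symbol `phi` for the external phase, and the function names `coupler`, `shifter`, and `mzi` are all assumptions and are not taken from the application. Under these assumptions the product is a 2×2 unitary matrix and the internal phase 2θ sets the power split.

```python
import cmath
import math

def matmul(a, b):
    """Multiply two small matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def coupler():
    """50:50 directional coupler (assumed symmetric convention)."""
    s = 1 / math.sqrt(2)
    return [[s, 1j * s], [1j * s, s]]

def shifter(angle):
    """Phase shifter acting on the upper waveguide only (assumption)."""
    return [[cmath.exp(1j * angle), 0], [0, 1]]

def mzi(theta, phi):
    """Forward MZI: coupler B1, internal shifter 2*theta, coupler B2, external shifter phi."""
    m = matmul(shifter(2 * theta), coupler())  # B1, then the internal shifter
    m = matmul(coupler(), m)                   # then B2
    return matmul(shifter(phi), m)             # then the external shifter

def is_unitary(u, tol=1e-12):
    """Check that the columns of u are orthonormal."""
    n = len(u)
    return all(
        abs(sum(u[k][i].conjugate() * u[k][j] for k in range(n)) - (i == j)) < tol
        for i in range(n) for j in range(n))

u = mzi(theta=0.4, phi=0.7)
print(is_unitary(u))
# Power transferred from the upper input to each output port.
print(abs(u[0][0]) ** 2, abs(u[1][0]) ** 2)
```

With this convention the two printed powers equal sin²θ and cos²θ, so sweeping θ from 0 to π/2 moves all of the power from one output port to the other, while `phi` changes only the relative phase.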
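Comparing this with Figure 2, the cascaded MZI structure in the first half of Figure 2 corresponds to a 6-dimensional unitary matrix, whose topology is shown in Figure 4. In Figure 4, each "●" represents one MZI structure with its input and output ends reversed, 1 to 6 are the six input signals, and the label "xy" at each "●" identifies the two input signals of that MZI structure. For example, "65" means that the two inputs of that MZI structure are channels 6 and 5.
The computing chip in this embodiment uses the programmable optical structure to compute and process the target optical signal that represents the AI model's input data. It can quickly obtain the model processing result of the AI model for that input data, thereby improving the hardware's capability to process complex operations.
Specifically, the electrical-domain controller in the computing chip controls the electro-optic modulator array to convert the laser signal emitted by the signal transmitter, so that an ordinary laser signal becomes a target optical signal representing the AI model's input data. The programmable optical structure then performs fast computation on the target optical signal to obtain the corresponding optical computation result. So that this result can be displayed and used, the photodetector array performs photoelectric conversion on it to obtain its electrical-signal representation.
Because optical signals offer very high speed, very large capacity, high parallelism, and strong immunity to interference, using a programmable optical structure to accelerate AI model processing can speed up the complex operations in the model, and the computing chip therefore features low power consumption, high throughput, and low latency. Note that the photonic computing chip has a non-von Neumann architecture and can compute at the speed of light, giving it higher computing power than electronic AI chips.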
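Based on the above embodiment, the structure of the electrical-domain controller is shown in Figure 5. The electrical-domain controller comprises a digital-to-analog conversion module, a logic control circuit, an SRAM (Static Random-Access Memory) storage module, and a driver module.
Specifically, the digital-to-analog conversion module converts a first electrical signal into a first analog signal, and the logic control circuit transmits the first analog signal to the electro-optic modulator array so that the array converts the laser signal into the target optical signal according to the first analog signal. The first electrical signal is an instruction capable of converting the laser signal into the target optical signal. Accordingly, the electro-optic modulator array modulates the light intensity of the laser signal according to the first analog signal to obtain the target optical signal. The electro-optic modulator array can thus modulate the light intensity of the laser signal.
In one specific implementation, the digital-to-analog conversion module also converts a second electrical signal into a second analog signal, and the logic control circuit also transmits the second analog signal to the programmable optical structure so that the programmable optical structure adjusts its own phase shifters according to the second analog signal and thereby implements the model weight matrix of the target AI model. The second electrical signal is the model weight matrix of the target AI model.
Specifically, the driver module in the electrical-domain controller drives the logic control circuit to transmit the second analog signal to the programmable optical structure, so that the programmable optical structure adjusts its own phase shifters according to the second analog signal. The driver module can thus control every phase shifter in the programmable optical structure. In general, the phase of a phase shifter can be changed by modulating a voltage to alter the refractive index of the material, or by adjusting a voltage to change the physical spacing of the interference arms.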
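Note that the model processing result output by the photodetector array may not be the final result. The programmable optical structure may implement the weight matrix of only one layer of the model and therefore cannot compute the model's result for the input data in a single pass; in that case the programmable optical structure can be used repeatedly. Since there is a weight matrix between each pair of adjacent layers of an AI model, the target AI model may have several weight matrices. If the programmable optical structure implements all of the model's weight matrices, it can output the model's final result directly. If it implements the weight matrices of only some layers, its output is the corresponding intermediate result. For the 6-input, 6-output programmable optical structure of the embodiment below, the first output can be stored temporarily in the storage module of the electrical-domain controller and retrieved later for further computation.
Accordingly, the storage module in the electrical-domain controller stores the model processing result, the first electrical signal, the second electrical signal, and/or the output of a nonlinear activation function. The programmable optical structure performs linear operations, so the nonlinear activation function of the target AI model can be computed in software, and the electrical-domain controller then retrieves the software result.
Based on the above embodiment, the structure of the signal transmitter is shown in Figure 5. The signal transmitter comprises a laser and an optical fiber array connected to the laser. The laser generates the laser signal, and the fiber array transmits it to the electro-optic modulator array over a preset number of input paths. In other words, the fiber array provides the corresponding number of input paths, splits the laser signal into several optical signals, and delivers them to the electro-optic modulator array. Each input path corresponds to one electro-optic modulator, so the set of optical input paths corresponds to one electro-optic modulator array. Each electro-optic modulator can adjust the intensity of the optical signal on its own path. Likewise, the photodetector array contains multiple photodetectors, each of which converts the optical signal on its own path.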
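Referring to Figure 5, an embodiment of the present application discloses the design architecture of another computing chip. As shown in Figure 5, the processor core of the computing chip is divided into an electrical domain and an optical domain. The electrical domain is a CMOS microelectronic chip comprising a logic control module, a storage module, a digital-to-analog conversion module, a driver module, and so on. The digital-to-analog conversion module performs D/A or A/D conversion. The optical domain is a silicon photonic chip integrating optical waveguides and optical modulators; it mainly performs linear multiplication of matrices and vectors and comprises the electro-optic modulator array, the programmable optical matrix (that is, the programmable optical structure), and the photodetector array. The electrical-domain and optical-domain parts are packaged using a flip-chip process and are connected through matching bump arrays.
The programmable optical matrix is a cascaded optical modulation array (that is, the cascaded MZI structure) that can perform linear multiplication of a two-dimensional weight matrix by a one-dimensional input vector. Specifically, the electronic chip converts the optimized model weight matrix (the second electrical signal) into a voltage signal (the second analog signal) through D/A conversion and uses that voltage to drive the optical modulation array to intensity-modulate the laser signal in the waveguide, that is, to adjust the phase shifters in the cascaded MZI structure so as to modulate phase and intensity. In this way the optical modulation array implements the model weight matrix W. The weight matrix W is obtained by training the model in computer software. During training, a cross-entropy loss function can be used to compute the deviation between the network output and the actual value (label), and the error backpropagation algorithm is then used to iteratively minimize that deviation and obtain the network's weight matrices. The trained weight matrices are loaded onto the optical modulation array of the computing chip as analog signals so that the array can be used for model inference.
The integrated silicon photonic chip connects to a single external laser source through a coupled fiber array. While the computing chip is operating, the laser continuously supplies the chip with a continuous laser signal.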
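The microelectronic chip converts the preset input data (the first electrical signal) into a voltage signal (the first analog signal) through the D/A conversion module and uses that voltage to drive the electro-optic modulator array to attenuate the incident light intensity, encoding the multiple incident optical signals as a one-dimensional input column vector x (which represents the model input data). This step moves from the digital domain to the analog domain.
The optical signal from the electro-optic modulator array enters the programmable optical matrix, which computes and outputs a one-dimensional vector result. The photodetector array receives this result and converts it into multiple current signals. These currents are converted into voltage signals by transimpedance amplifiers, converted into digital signals through A/D conversion, and stored in the microelectronic chip. This step returns from the analog domain to the digital domain.
Following this flow, one linear multiplication of a matrix by the vector x can be completed at the speed of light: a = W·x, where a denotes the output of the programmable optical matrix. In this flow, the nonlinear activation function can be implemented with materials and structures that satisfy the requirements of an activation function, such as saturable absorbers, bistable structures, and the Kerr effect in an MZI. To reduce the structural complexity of the photonic computing chip, the nonlinear activation function can instead be implemented in the electrical domain with a software algorithm. In one specific implementation, the nonlinear activation function can also be implemented with an optical structure, such as a saturable absorber structure, a bistable structure, or an MZI structure.
The computing chip shown in Figure 5 can perform matrix-vector linear multiplication. Specifically, the chip encodes data by modulating the amplitude or phase of laser pulses, and the data are continuous real numbers. The chip can also follow the conventional fully connected neural network architecture, which in principle comprises an input layer, several hidden layers, and an output layer. In general, each layer of the network contains several neuron nodes, and the neuron nodes of adjacent layers are connected through a weight matrix that performs a linear matrix multiplication. The neuron node values pass through a nonlinear activation function before entering the next layer. The computing chip provided in this embodiment can use optical structures to implement the linear and nonlinear operations that these software neuron nodes perform, such as computation, weighting, and activation.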
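The programmable optical matrix is thus a key component of the computing chip. Its implementation principle is described below.
Mathematically, any two-dimensional real matrix W(m,n) can be decomposed by singular value decomposition into the product of three matrices U, Σ, and V^T, that is, W = U(m)ΣV^T(n). Here U(m) is an m×m unitary matrix, Σ is an m×n diagonal matrix with non-negative real numbers on the diagonal, and V^T(n) is an n×n unitary matrix that is the conjugate transpose of V(n). The model weight matrix can therefore be split into the product of two unitary matrices and one diagonal matrix, and implementing these three matrices separately with optical devices yields a programmable optical structure that implements the model weight matrix. If the model weight matrix is split into the product of U, Σ, and V^T, the input optical signal passes in turn through the optical structures corresponding to V^T(n), Σ, and U(m) to produce the corresponding model processing result.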
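The order in which the signal meets the three factors can be checked with a small numerical sketch. The Python example below is an illustration under stated assumptions: the 2×2 factors `U`, `sigma`, and `Vt` are chosen by hand (plane rotations and two singular values) rather than computed by a singular value decomposition routine, and all names are hypothetical. It shows that applying V^T, then Σ, then U to an input vector gives the same result as multiplying by W directly.

```python
import math

def rot(angle):
    """2x2 plane rotation, used here as a simple real unitary matrix."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Multiply two small matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(m, x):
    """Multiply a matrix by a column vector."""
    return [sum(r * v for r, v in zip(row, x)) for row in m]

# Hand-picked factors (assumption): W = U * Sigma * V^T.
U = rot(0.3)
Vt = rot(-0.8)                 # transpose of rot(0.8)
sigma = [0.9, 0.4]             # non-negative diagonal entries of Sigma
S = [[sigma[0], 0.0], [0.0, sigma[1]]]
W = matmul(U, matmul(S, Vt))

x = [0.5, 0.2]
# Staged evaluation in the order the light meets the structures: V^T, Sigma, U.
after_vt = matvec(Vt, x)
after_sigma = [s * v for s, v in zip(sigma, after_vt)]
staged = matvec(U, after_sigma)
direct = matvec(W, x)
print(staged, direct)
```

The diagonal stage is an element-wise scaling, which is why it can be realized by one attenuator per channel rather than by a full mesh.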
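The following describes how to implement the weight matrix W by adjusting the phase shifters in the MZI structures. From Figure 3, when the input optical signal propagates from left to right, the output column vector is xout = UMZI·xin, and the transfer matrix of a single MZI structure can be expressed as:
Any m-dimensional unitary matrix can be implemented by cascading single MZI structures. Mathematically, a sequence of dimension-reducing transformations can be applied to the m-dimensional unitary matrix in two-dimensional subspaces: the matrix is first reduced to dimension m-1, and the process is repeated until it is reduced to dimension two. Because the matrix operation of a single MZI structure is two-dimensional, factorizing an m-dimensional unitary matrix requires first extending the two-dimensional matrix of a single MZI structure to an m-dimensional matrix Tqp.
The m-dimensional matrix Tqp is as follows:
In Tqp, p and q are the input-port numbers of the optical matrix that feed the two input ports of the MZI structure (see Figure 4), with 0≤p<q≤m. As shown in Figure 4, the two optical signals from optical-matrix input ports p=1 and q=6 converge in the MZI structure corresponding to T61, and T61 also denotes the extended matrix of that MZI structure.
Tqp is derived from the identity matrix, with the element in row p, column p replaced by u11, the element in row p, column q replaced by u12, the element in row q, column p replaced by u21, and the element in row q, column q replaced by u22. All remaining diagonal elements are 1 and all remaining off-diagonal elements are 0. In other words, when Tqp takes part in an operation, only the signals entering the ports of the corresponding MZI structure are changed. The other channels are unaffected, and the corresponding part of Tqp is diagonal. The Tqp for each MZI structure is obtained by controlling the values of its phase shifters.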
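The construction of Tqp from the identity matrix can be written out directly. The following Python sketch is an illustration; the function name `embed_mzi` is hypothetical, and it assumes 1-based port numbers as used for ports 1 to 6 in Figure 4.

```python
def embed_mzi(m, p, q, u):
    """Return the m x m matrix T_qp that embeds the 2x2 block u on ports p < q.

    Ports are numbered from 1 (assumption, matching ports 1..6 of Figure 4).
    """
    if not (1 <= p < q <= m):
        raise ValueError("expected 1 <= p < q <= m")
    # Start from the identity matrix.
    t = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    # Replace the four entries at the intersections of rows/columns p and q.
    t[p - 1][p - 1] = u[0][0]   # u11
    t[p - 1][q - 1] = u[0][1]   # u12
    t[q - 1][p - 1] = u[1][0]   # u21
    t[q - 1][q - 1] = u[1][1]   # u22
    return t

# Placeholder block entries, chosen only to make the positions visible.
t61 = embed_mzi(6, 1, 6, [["u11", "u12"], ["u21", "u22"]])
for row in t61:
    print(row)
```

Printing `t61` shows the four block entries in the corners of the 6×6 matrix and an identity pattern everywhere else, which is why channels 2 to 5 pass through that MZI stage unchanged.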
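Any m-dimensional unitary matrix can be reduced to dimension m-1 by right-multiplying it by Tm(m-1), Tm(m-2), …, Tm2, Tm1, satisfying:
Let R(m) = Tm(m-1)Tm(m-2)…Tm2Tm1. Then U(m)R(m)R(m-1)…R(2) = D, where D is a diagonal matrix whose entries have modulus 1. The m-dimensional unitary matrix can then be expressed as U(m) = D·R^T(m)·R^T(m-1)…R^T(2), and on this basis it can be implemented by a cascaded MZI structure.
Likewise, the n-dimensional unitary matrix can be expressed as V^T(n) = D·R^T(n)·R^T(n-1)…R^T(2), and on this basis it can be implemented by a cascaded MZI structure.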
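The row-by-row reduction described above can be illustrated with a simplified numerical sketch. The Python example below is not the application's procedure: it substitutes real plane (Givens) rotations for the complex MZI matrices Tqp and handles only real orthogonal matrices, and the names `givens_right` and `triangular_null` are hypothetical. It right-multiplies the matrix by one rotation per pair (q, p), working from the last row upward, until only a diagonal matrix with entries of modulus 1 remains.

```python
import math

def givens_right(u, p, q, c, s):
    """Right-multiply u in place by a rotation mixing columns p and q (0-based)."""
    for row in u:
        a, b = row[p], row[q]
        row[p] = c * a + s * b
        row[q] = -s * a + c * b

def triangular_null(u):
    """Reduce a real orthogonal matrix to a diagonal one by successive rotations.

    Returns the reduced matrix and the list of rotations as (q, p, c, s) with
    1-based port numbers, in the order T_q(q-1), ..., T_q1 for q = m, ..., 2.
    """
    m = len(u)
    u = [row[:] for row in u]            # work on a copy
    rotations = []
    for q in range(m - 1, 0, -1):        # row being cleared, bottom to top
        for p in range(q - 1, -1, -1):   # column paired with column q
            a, b = u[q][p], u[q][q]
            r = math.hypot(a, b)
            if r == 0:
                continue                 # nothing to null for this pair
            c, s = b / r, -a / r         # makes the new u[q][p] equal to zero
            givens_right(u, p, q, c, s)
            rotations.append((q + 1, p + 1, c, s))
    return u, rotations

# Example input: a 3x3 orthogonal matrix built from three arbitrary rotations.
example = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
for p, q, angle in [(0, 1, 0.3), (1, 2, 1.1), (0, 2, -0.7)]:
    givens_right(example, p, q, math.cos(angle), math.sin(angle))

d, rotations = triangular_null(example)
print(len(rotations))
print([round(d[i][i], 6) for i in range(3)])
```

For a generic m-dimensional matrix this uses m(m-1)/2 rotations, one per pair of ports, which corresponds to one MZI per node in a triangular mesh.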
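In addition, the diagonal matrix Σ only requires control of each diagonal element, so it can be implemented with MZI-based optical attenuators. m parallel optical attenuators can program an m-dimensional diagonal matrix. The structure of a single optical attenuator is shown in Figure 6, in which the lower input and output are blocked. When the input light intensity is E, the output light intensity is attenuated to E·cos²θ.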
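Based on the above principle, the 6-input, 6-output programmable optical structure shown in Figure 2 can be realized. If the simplest 2-layer fully connected neural network is built for classification and recognition, the whole network comprises an input layer, a hidden layer, and an output layer. The input layer is a feature vector of 6 elements extracted from the target to be classified. The two weight matrices between the layers are shown in Figure 7, which illustrates only the two weight matrices and does not draw the layers.
Because the feature vector of the input layer has 6 elements, two 6-dimensional weight matrices W and two nonlinear activation functions f are needed to complete two matrix-vector linear multiplications and two nonlinear operations. Each weight matrix W is decomposed by singular value decomposition into two 6-dimensional unitary matrices and one 6-dimensional diagonal matrix Σ. The unitary and diagonal matrices are computed by the photonic computing chip.
Current unitary matrix decomposition methods include the Reck triangular decomposition and the Clements rectangular decomposition. This embodiment uses the triangular decomposition as an example, in which a 6-dimensional unitary matrix can be decomposed as:
Specifically, the topology of the 6-dimensional unitary matrix is as shown in Figure 4. The input laser signal starts propagating from T65^T, where T65^T is the transpose of T65 and is equivalent to connecting the MZI in reverse, so the optical signal first passes through the external phase shifter.
By loading different phase values, V1^T(6), U1(6), V2^T(6), and U2(6) can be realized in turn. The nonlinear activation function f is computed in software and can be ReLU, the Sigmoid function, and so on. For a classification and recognition network, the final output must also pass through the normalized exponential function softmax.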
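To simplify the structure of the photonic computing chip, only one computing structure corresponding to a 6-dimensional unitary matrix is provided in the chip, and reusing this structure is enough to complete the computation. Figure 2 contains one computing structure for a 6-dimensional unitary matrix (the cascaded MZI structure) and one computing structure for a 6-dimensional diagonal matrix (the parallel optical attenuator structure). With the structure of Figure 2, completing one matrix-vector linear multiplication requires two passes through the photonic computing chip.
First pass: program the phase shifters so that the unitary-matrix structure of the photonic computing chip realizes V1^T(6) and the diagonal-matrix structure realizes Σ1. Next, the electro-optic modulator array encodes the input feature vector x onto the laser signal and the computation starts. The photodetector array then collects the result of the first pass, which is stored temporarily in memory.
Second pass: program the phase shifters so that the unitary-matrix structure realizes U1(6) and the diagonal-matrix structure realizes the identity diagonal matrix. Next, the stored data is loaded onto the electro-optic modulators in preparation for the second pass. The photodetectors then collect the result of the second pass, which is stored temporarily in memory. This completes one linear multiplication of the weight matrix by the input feature vector, W(1)x. Finally, the nonlinear operation z(1) = f(W(1)x) is completed on the electronic chip and the result is stored temporarily in memory.
At this point the data has moved from the input layer to the hidden layer. The 2-layer fully connected neural network must repeat the above operations to perform the weight matrix-vector linear multiplication W(2)z(1). Finally, the photodetectors identify the channel with the maximum output power, which identifies the object category. If quantitative analysis is needed, the softmax function can be computed in software. The data has then moved from the hidden layer to the output layer.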
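The two-pass, time-multiplexed flow above can be sketched in software. The Python example below is an illustrative model only: it treats each pass as an ideal real-valued matrix-vector product, ignores optical encoding and detection details, and assumes that the final layer output goes directly to an argmax and an optional softmax. The names `optical_pass`, `layer`, and `classify` are hypothetical.

```python
import math

def matvec(m, x):
    """Multiply a matrix by a column vector."""
    return [sum(r * v for r, v in zip(row, x)) for row in m]

def optical_pass(unitary, diagonal, x):
    """One pass through the chip: cascaded-MZI stage, then attenuator stage."""
    return [d * v for d, v in zip(diagonal, matvec(unitary, x))]

def layer(u, sigma, vt, x, f):
    """Compute f(W x) with W = U * Sigma * V^T using two passes and a buffer."""
    memory = optical_pass(vt, sigma, x)                 # pass 1: Sigma * V^T * x
    memory = optical_pass(u, [1.0] * len(x), memory)    # pass 2: U * (...), identity diagonal
    return [f(v) for v in memory]                       # nonlinearity in software

def relu(v):
    return max(0.0, v)

def softmax(z):
    """Normalized exponential, shifted by the maximum for numerical stability."""
    top = max(z)
    e = [math.exp(v - top) for v in z]
    total = sum(e)
    return [v / total for v in e]

def classify(layers, x):
    """Run all layers; return the index of the strongest channel and softmax scores."""
    *hidden, last = layers
    for u, sigma, vt in hidden:
        x = layer(u, sigma, vt, x, relu)
    u, sigma, vt = last
    out = layer(u, sigma, vt, x, lambda v: v)   # linear output (assumption)
    return max(range(len(out)), key=lambda i: out[i]), softmax(out)

identity = [[1.0, 0.0], [0.0, 1.0]]
layers = [(identity, [1.0, 1.0], identity), (identity, [1.0, 1.0], identity)]
print(classify(layers, [0.2, 0.9]))
```

Reusing one unitary stage for both V^T and U halves the optical hardware at the cost of one extra detect, store, and re-encode round trip per layer.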
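The programmable optical structure in the photonic computing chip is thus suited to operations between a matrix and a vector. For the matrix-matrix operations that often arise in neural networks, one matrix can first be split into multiple vectors on the electronic chip; each vector is then multiplied by the other matrix in the photonic computing chip in turn to produce multiple output vectors, which are finally assembled into the output matrix on the electronic chip. Alternatively, optical interconnection technology can be used to process the vectors in parallel on different photonic computing chips. The photonic computing chip provided in this embodiment can speed up matrix multiplication in neural networks and features low power consumption, high throughput, and low latency.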
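The column-splitting scheme for matrix-matrix products can be sketched as follows. This Python example is illustrative; the `chip` argument stands in for one matrix-vector pass on a photonic computing chip and is modelled here by an ordinary software `matvec`, and all names are hypothetical.

```python
def matvec(w, x):
    """Multiply a matrix by a column vector (stand-in for one optical pass)."""
    return [sum(r * v for r, v in zip(row, x)) for row in w]

def matmul_by_columns(w, b, chip=matvec):
    """Compute W * B by sending the columns of B through `chip` one at a time."""
    # Split B into column vectors (done on the electronic chip).
    columns = [[row[j] for row in b] for j in range(len(b[0]))]
    # Each column is an independent job, so the jobs could also run on separate chips.
    outputs = [chip(w, col) for col in columns]
    # Reassemble the output vectors as the columns of the result matrix.
    return [[outputs[j][i] for j in range(len(outputs))] for i in range(len(w))]

product = matmul_by_columns([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)
```

Because the column jobs do not depend on one another, the loop over `columns` is the part that several optically interconnected chips could share.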
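A computing system provided by an embodiment of the present application is described below. The computing system described below and the computing chip described above may be referred to in conjunction with each other.
This embodiment provides a computing system comprising a plurality of computing chips according to any of the above embodiments, the computing chips being connected using optical interconnection technology.
The computing system provided in this embodiment can use optical structures to process complex operations, thereby improving the hardware's capability to process complex operations.
A data processing method provided by an embodiment of the present application is described below. The data processing method described below and the computing chip described above may be referred to in conjunction with each other.
Referring to Figure 8, this embodiment provides a data processing method, applied to the computing chip of any of the above embodiments, comprising:
S801: emit a laser signal with the signal transmitter.
S802: control, with the electrical-domain controller, the electro-optic modulator array to convert the laser signal into a target optical signal; the target optical signal represents the input data of the target AI model.
S803: perform computation on the target optical signal with the programmable optical structure and output an optical computation result; the programmable optical structure implements the model weight matrix of the target AI model.
S804: perform photoelectric conversion on the optical computation result with the photodetector array to obtain the model processing result of the target AI model for the input data.
For details of this embodiment, refer to the corresponding content disclosed in the preceding embodiments, which is not repeated here.
The data processing method provided for the computing chip in this embodiment can use optical structures to process complex operations, thereby improving the hardware's capability to process complex operations.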
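The terms "first", "second", "third", "fourth", and so on (if present) in the present application are used to distinguish similar objects and do not necessarily describe a specific order or sequence. Data so described are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. Furthermore, the terms "comprise" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, or device that comprises a series of steps or units is not necessarily limited to the steps or units explicitly listed and may include other steps or units that are not explicitly listed or that are inherent to the process, method, or device.
Note that descriptions involving "first", "second", and so on in the present application are for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly specifying the number of the technical features indicated. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with one another, provided that a person of ordinary skill in the art can implement the combination. Where a combination of technical solutions is contradictory or cannot be implemented, that combination shall be deemed not to exist and falls outside the scope of protection claimed by the present application.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to in conjunction with one another.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of readable storage medium known in the technical field.
Specific examples are used herein to explain the principles and implementations of the present application. The description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. A person of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.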
Claims (20)
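- A computing chip, characterized by comprising: a signal transmitter, configured to emit a laser signal; an electro-optic modulator array, configured to convert the laser signal into a target optical signal under the control of an electrical-domain controller in the computing chip, the target optical signal representing the input data of a target AI model; a programmable optical structure that implements the model weight matrix of the target AI model, configured to perform computation on the target optical signal and output an optical computation result; and a photodetector array, configured to perform photoelectric conversion on the optical computation result to obtain the model processing result of the target AI model for the input data.
- The computing chip according to claim 1, characterized in that the electrical-domain controller comprises: a digital-to-analog conversion module, configured to convert a first electrical signal into a first analog signal; and a logic control circuit, configured to transmit the first analog signal to the electro-optic modulator array, so that the electro-optic modulator array converts the laser signal into the target optical signal according to the first analog signal.
- The computing chip according to claim 2, characterized in that the electro-optic modulator array is specifically configured to modulate the light intensity of the laser signal according to the first analog signal to obtain the target optical signal.
- The computing chip according to claim 2, characterized in that the digital-to-analog conversion module is further configured to convert a second electrical signal into a second analog signal; and the logic control circuit is further configured to transmit the second analog signal to the programmable optical structure, so that the programmable optical structure adjusts its own phase shifters according to the second analog signal.
- The computing chip according to claim 4, characterized in that the electrical-domain controller further comprises: a driver module, configured to drive the logic control circuit to transmit the second analog signal to the programmable optical structure, so that the programmable optical structure adjusts its own phase shifters according to the second analog signal.
- The computing chip according to claim 5, characterized in that the electrical-domain controller further comprises: a storage module, configured to store the model processing result, the first electrical signal, the second electrical signal, and/or the output of a nonlinear activation function.
- The computing chip according to claim 1, characterized in that the signal transmitter comprises: a laser, configured to generate the laser signal; and an optical fiber array connected to the laser, configured to transmit the laser signal to the electro-optic modulator array over a preset number of input paths.
- The computing chip according to claim 1, characterized by further comprising: a saturable absorber structure, a bistable structure, or an MZI structure for implementing a nonlinear activation function.
- The computing chip according to any one of claims 1 to 8, characterized in that the programmable optical structure comprises a cascaded MZI structure and a parallel optical attenuator structure.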
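- The computing chip according to claim 11, characterized in that the MZI structures are cascaded to produce any target m-dimensional unitary matrix.
- The computing chip according to claim 13, characterized in that the m-dimensional matrix Tqp is obtained by transforming the identity matrix.
- The computing chip according to claim 14, characterized in that obtaining the m-dimensional matrix Tqp by transforming the identity matrix comprises: replacing the element of Tqp in row p, column p with u11, the element in row p, column q with u12, the element in row q, column p with u21, and the element in row q, column q with u22; all remaining diagonal elements of Tqp take the value 1 and all remaining off-diagonal elements take the value 0.
- The computing chip according to claim 16, characterized in that, for any m-dimensional unitary matrix, the n-dimensional unitary matrix is: V^T(n) = D·R^T(n)·R^T(n-1)…R^T(2).
- The computing chip according to claim 17, characterized in that, for any m-dimensional unitary matrix, the decomposition methods for the m-dimensional unitary matrix include the Reck triangular decomposition and the Clements rectangular decomposition.
- A computing system, characterized by comprising a plurality of computing chips according to any one of claims 1 to 17, the computing chips being connected using optical interconnection technology.
- A data processing method, characterized in that it is applied to the computing chip according to any one of claims 1 to 17 and comprises: emitting a laser signal with a signal transmitter; controlling, with an electrical-domain controller, an electro-optic modulator array to convert the laser signal into a target optical signal, the target optical signal representing the input data of a target AI model; performing computation on the target optical signal with a programmable optical structure and outputting an optical computation result, the programmable optical structure implementing the model weight matrix of the target AI model; and performing photoelectric conversion on the optical computation result with a photodetector array to obtain the model processing result of the target AI model for the input data.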
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210417937.3 | 2022-04-21 | ||
CN202210417937.3A CN114520694A (zh) | 2022-04-21 | 2022-04-21 | 一种计算芯片、系统及数据处理方法 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023201970A1 true WO2023201970A1 (zh) | 2023-10-26 |
Family
ID=81600299
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/118103 WO2023201970A1 (zh) | 2022-04-21 | 2022-09-09 | 一种计算芯片、系统及数据处理方法 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114520694A (zh) |
WO (1) | WO2023201970A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117891023A (zh) * | 2024-03-15 | 2024-04-16 | 山东云海国创云计算装备产业创新中心有限公司 | 光子芯片、异构计算系统、精度调整方法及产品 |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114520694A (zh) * | 2022-04-21 | 2022-05-20 | 苏州浪潮智能科技有限公司 | 一种计算芯片、系统及数据处理方法 |
CN114722746B (zh) * | 2022-05-24 | 2022-11-01 | 苏州浪潮智能科技有限公司 | 一种芯片辅助设计方法、装置、设备及可读介质 |
CN117434998A (zh) * | 2022-07-15 | 2024-01-23 | 南京光智元科技有限公司 | 计算系统及处理光子计算结果的方法 |
CN115222034A (zh) * | 2022-07-27 | 2022-10-21 | 董毅博 | 基于VCSEL阵列的三维光子芯片架构及应用、DNNs结构计算方法 |
CN116777727B (zh) * | 2023-06-21 | 2024-01-09 | 北京忆元科技有限公司 | 存算一体芯片、图像处理方法、电子设备及存储介质 |
CN117294358B (zh) * | 2023-09-26 | 2024-07-05 | 光本位科技(苏州)有限公司 | 基于数字逻辑控制的光子计算单元 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110197277A (zh) * | 2019-05-13 | 2019-09-03 | 浙江大学 | 实现数字识别的光学神经网络方法 |
CN110503196A (zh) * | 2019-08-26 | 2019-11-26 | 光子算数(北京)科技有限责任公司 | 一种光子神经网络芯片以及数据处理系统 |
CN111898741A (zh) * | 2020-08-04 | 2020-11-06 | 上海交通大学 | 基于铌酸锂的片上级联mzi可重构量子网络 |
CN112232504A (zh) * | 2020-09-11 | 2021-01-15 | 联合微电子中心有限责任公司 | 一种光子神经网络 |
CN114520694A (zh) * | 2022-04-21 | 2022-05-20 | 苏州浪潮智能科技有限公司 | 一种计算芯片、系统及数据处理方法 |
-
2022
- 2022-04-21 CN CN202210417937.3A patent/CN114520694A/zh active Pending
- 2022-09-09 WO PCT/CN2022/118103 patent/WO2023201970A1/zh unknown
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110197277A (zh) * | 2019-05-13 | 2019-09-03 | 浙江大学 | 实现数字识别的光学神经网络方法 |
CN110503196A (zh) * | 2019-08-26 | 2019-11-26 | 光子算数(北京)科技有限责任公司 | 一种光子神经网络芯片以及数据处理系统 |
CN111898741A (zh) * | 2020-08-04 | 2020-11-06 | 上海交通大学 | 基于铌酸锂的片上级联mzi可重构量子网络 |
CN112232504A (zh) * | 2020-09-11 | 2021-01-15 | 联合微电子中心有限责任公司 | 一种光子神经网络 |
CN114520694A (zh) * | 2022-04-21 | 2022-05-20 | 苏州浪潮智能科技有限公司 | 一种计算芯片、系统及数据处理方法 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117891023A (zh) * | 2024-03-15 | 2024-04-16 | 山东云海国创云计算装备产业创新中心有限公司 | 光子芯片、异构计算系统、精度调整方法及产品 |
CN117891023B (zh) * | 2024-03-15 | 2024-05-31 | 山东云海国创云计算装备产业创新中心有限公司 | 光子芯片、异构计算系统、精度调整方法及产品 |
Also Published As
Publication number | Publication date |
---|---|
CN114520694A (zh) | 2022-05-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2023201970A1 (zh) | 一种计算芯片、系统及数据处理方法 | |
Shastri et al. | Photonics for artificial intelligence and neuromorphic computing | |
US11817903B2 (en) | Coherent photonic computing architectures | |
Bai et al. | Photonic multiplexing techniques for neuromorphic computing | |
Wu et al. | Analog optical computing for artificial intelligence | |
CN111683304B (zh) | 在光波导和/或光芯片上实现的全光衍射神经网络及系统 | |
JP7115691B2 (ja) | フォトニックリザーバコンピューティングシステムのトレーニング | |
Bai et al. | Towards silicon photonic neural networks for artificial intelligence | |
TW202103025A (zh) | 混合類比-數位矩陣處理器 | |
CN112506265B (zh) | 一种光计算装置以及计算方法 | |
CN112232503B (zh) | 计算装置、计算方法以及计算系统 | |
Youngblood | Coherent photonic crossbar arrays for large-scale matrix-matrix multiplication | |
CN113657580A (zh) | 基于微环谐振器和非易失性相变材料的光子卷积神经网络加速器 | |
WO2023109065A1 (zh) | 循环神经网络的实现方法、系统、电子设备及存储介质 | |
JP2024503991A (ja) | 行列計算用のバランス型フォトニック・アーキテクチャ | |
CN114325932B (zh) | 一种片上集成的全光神经网络光计算芯片 | |
WO2023005084A1 (zh) | 光学电路搭建方法、光学电路、光信号处理方法及装置 | |
CN112101540A (zh) | 光学神经网络芯片及其计算方法 | |
CN117436486A (zh) | 一种基于薄膜铌酸锂和硅混合的光学卷积神经网络 | |
US20240013041A1 (en) | Single ended eam with electrical combining | |
CN113325917A (zh) | 一种光计算装置、系统以及计算方法 | |
Brunner et al. | Nonlinear photonic dynamical systems for unconventional computing | |
CN113325650B (zh) | 一种光学电路、光信号处理方法、装置及可读存储介质 | |
CN118261220A (zh) | 一种注意力机制的光学计算装置、计算方法及计算系统 | |
Huang et al. | Photonic computing: an introduction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22938195 Country of ref document: EP Kind code of ref document: A1 |