CN114723016A

CN114723016A - On-chip photon convolution neural network and construction method thereof

Info

Publication number: CN114723016A
Application number: CN202210451306.3A
Authority: CN
Inventors: 张航; 邢壮壮; 李星桥; 李子锐
Original assignee: Central South University
Current assignee: Central South University
Priority date: 2022-04-26
Filing date: 2022-04-26
Publication date: 2022-07-08

Abstract

The invention discloses an on-chip photonic convolutional neural network and a construction method thereof. The network comprises a light source module for generating a plurality of groups of composite signal light carrying encoded information; a convolution module for performing a convolution operation on the composite signal light to obtain an optical calculation result; a data acquisition and processing module for converting the optical calculation result into an electrical signal and amplifying and processing that signal; and a control module for generating control signals and inputting them to the light source module and the convolution module respectively. The invention simplifies the structure of the neural network, reduces the number of optical devices needed to implement it, and facilitates the construction of large-scale silicon photonic chips.

Description

On-chip photon convolution neural network and construction method thereof

Technical Field

The invention belongs to the technical field of artificial intelligence and photonic computing, and particularly relates to an on-chip photonic convolutional neural network and a construction method thereof.

Background

Because matrix multiplication is so computationally expensive, the traditional central processing unit is becoming a suboptimal choice for running deep learning algorithms. Photons outperform electrons in energy consumption and computation rate, and matrix multiplication can be realized optically through the coherence and superposition of linear optics, which makes photonics an attractive platform.

Owing to its local weight-sharing structure, the convolutional neural network has unique advantages in speech recognition, natural language processing, image processing and similar tasks. It is widely used in artificial intelligence and has become one of the most actively studied neural network models on optical platforms. Under existing conditions, however, light suffers a certain loss as it propagates along an optical path, which limits the size of the neural network model that can be realized on an optical platform. In addition, on-chip space is limited, some device control modules also need a power supply, and it is difficult to integrate a very large number of devices entirely on chip, so realizing a large-scale photonic convolutional neural network remains very challenging.

Chinese patent publication CN112232487A provides a light source generator for each optical path. This approach requires multiple light source generators to be integrated on the chip, which is costly and results in low utilization of chip space. It also uses two columns of resonators to form a photonic weight unit: each resonator in one column applies a first weight encoding to the weights, each resonator in the other column applies a second weight encoding, and each weight is obtained as the difference between the first and second weight encodings produced by the corresponding pair of resonators in each row. Obtaining the weights in this way is costly and hinders implementation.

Commonly used neural network models are too large: implementing them on a chip requires too many optical devices, while on-chip space is limited. Integrating an excessive number of optical devices not only increases manufacturing cost and loss but also complicates tuning of the devices. When the model is too large, the loss of light during transmission also accumulates to a level that can no longer be ignored, so large-scale neural networks are difficult to implement on an optical chip.

Disclosure of Invention

The embodiment of the invention aims to provide an on-chip photon convolution neural network and a construction method thereof, which reduce the number of optical devices for realizing the photon neural network, reduce the integration cost of the optical devices, save chip space resources and are beneficial to constructing a large-scale silicon optical convolution neural network chip.

In order to solve the technical problem, the invention adopts the technical scheme that the on-chip photon convolution neural network comprises:

the light source module is used for generating n groups of composite signal lights carrying information codes;

the convolution module is used for performing convolution operation on the composite signal light to obtain a light calculation result;

the data acquisition and processing module is used for converting the light calculation result into an electric signal and amplifying and processing the electric signal;

wherein the light source module includes:

the photon input module is used for generating m × n separated laser beams with different wavelengths;

the information coding module is used for encoding information onto each of the m × n separated laser beams according to the control signals;

n multiplexers, each used for multiplexing a group of m information-encoded separated laser beams, to obtain n groups of composite signal light;

and the control module is used for generating a control signal and respectively inputting the control signal to the information coding module and the convolution module.

Further, the photon input module comprises:

a continuous fiber laser for generating continuous laser light;

the high-Q microcavity is used for forming, in one waveguide, a series of lasers with equal frequency spacing and different wavelengths;

and the demultiplexer is used for separating the lasers in the waveguide by wavelength to obtain the separated lasers.

Furthermore, the information encoding module is composed of m × n electro-optical Mach-Zehnder modulators; the control circuit of each electro-optical Mach-Zehnder modulator is connected with the control module, its input end is connected with the demultiplexer, and its output end is connected with a multiplexer.

Furthermore, the convolution module comprises n micro-ring resonators, a control circuit of each micro-ring resonator is connected with the control module, an input port is connected with the output of each multiplexer, and a through port and a drop port are connected with the data acquisition and processing module.

Further, the data acquiring and processing module comprises:

the balanced photoelectric detector is used for converting the optical operation result output by the convolution module into an electric signal;

and the processing module is used for amplifying and processing the electric signal.

The method for constructing the on-chip photon convolution neural network comprises the following steps:

S1, obtaining the convolution kernel parameters of the original neural network to be mapped to the optical devices, wherein the convolution kernel parameters comprise the number n of convolution kernels, the number m of parameters in each convolution kernel, and the m weights corresponding to those parameters in each convolution kernel;

S2, normalizing the convolution kernel parameters of the original neural network, linearly scaling their values into [-1, 1];

s3, optimizing the normalization processing result of the convolution kernel parameters by using a pruning algorithm to obtain a final convolution kernel matrix W;

s4, using micro-ring resonators to realize convolution operation weight coding in the original neural network, setting n micro-ring resonators according to the number of convolution kernels to form an original convolution module, wherein each micro-ring resonator comprises m resonant cavities;

pruning the structure of the original convolution module according to the convolution kernel matrix W: when some parameters of a convolution kernel in W are zero, the corresponding resonant cavities in the corresponding micro-ring resonator are removed; when all parameters of a convolution kernel are zero, the corresponding micro-ring resonator is removed; the result is the final convolution module;

s5, determining the number of composite signal lights output by the light source module and the number of lasers contained in each group of composite signal lights according to the number of micro-ring resonators and resonant cavities in the convolution module determined in S4, and further determining the number of multiplexers, MZIs and the number of balanced photodetectors;

and arranging a corresponding light source input module and a data acquisition and processing module, and mapping the neural network on the chip to form the on-chip photon convolution neural network.

Further, the normalization operation is as follows:

where w_min represents the minimum value among the convolution kernel parameters, w_max represents the maximum value among the convolution kernel parameters, w represents an original convolution kernel parameter, and w' represents the normalized convolution kernel parameter;

the data acquisition and processing module is provided with an amplification algorithm, and the amplification algorithm is calculated as follows:

y' represents the output of the normalized data after convolution kernel calculation, y represents the amplified signal value, and x represents the input of the convolution module.

Furthermore, the pruning algorithm adopts a channel pruning algorithm, a convolution kernel pruning algorithm, convolution kernel width and height clipping or a sparse matrix method.

The beneficial effect of this embodiment is:

in the embodiment, the optical frequency comb is used as the on-chip light source input module, so that light sources can be provided for a plurality of light paths at the same time, the occupation of the light source module on the on-chip space is reduced, and the utilization efficiency of the on-chip light source and the utilization rate of the on-chip space are improved.

In this embodiment, the convolution kernels of the convolutional neural network are optimized with a pruning algorithm: some highly correlated convolution kernel parameters are pruned without affecting model accuracy, which reduces unnecessary computation and allows the corresponding devices, such as micro-ring resonators, to be omitted. This saves on-chip space and improves the computational efficiency and space utilization of the chip.

In the embodiment, the optical positive and negative weights are realized by using the difference value of two ports of the Add-Drop type micro-ring resonator, the number of optical devices involved in the process is small, the occupation of the space on a chip is small, and the normalization processing is performed on the convolution kernel parameters, so that the realization of a large-scale photonic neural network on the chip is facilitated.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is an overall configuration diagram of the present embodiment.

Fig. 2 is a structural diagram of the light source module of the present embodiment.

Fig. 3: (a) shows the removal of optical devices after all parameters of a convolution kernel are pruned; (b) shows the removal of optical devices after some parameters of a convolution kernel are pruned.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The structure of the on-chip photonic convolutional neural network is shown in fig. 1. It comprises a light source module, a convolution module, a data acquisition and processing module, and a control module. The light source module generates n groups of composite signal light carrying encoded information. The convolution module comprises n micro-ring resonators whose input ports receive the composite signal light; after the micro-ring resonators adjust the optical information of the composite signal light, the modulated signal light is input into the data acquisition and processing module through the through port and the drop port. There it is converted into an electrical signal, which is amplified in equal proportion and then passed on to subsequent processing such as data classification and target identification. The control module generates control signals and inputs them to the light source module and the convolution module respectively, controlling these modules to carry out information encoding and the convolution operation.

The structure of the light source module is shown in fig. 2. The light source module includes a photon input module for generating m × n separated lasers with different wavelengths, an information encoding module for encoding information onto the separated lasers according to a control signal, and multiplexers for multiplexing the separated lasers to obtain n groups of composite signal light. The photon input module comprises a continuous fiber laser, a high-Q microcavity and a demultiplexer. After passing through the high-Q microcavity, the continuous laser light generated by the continuous fiber laser produces, in one waveguide, a series of lasers with equal frequency spacing and different wavelengths. These lasers are input into the demultiplexer, which separates them by wavelength to obtain the m × n separated lasers. Here the high-Q microcavity refers to an optical micro-ring resonant cavity with a high quality factor.
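The comb-and-demultiplex arrangement above can be illustrated with a short sketch. The pump wavelength (1550 nm), line spacing (100 GHz), the values of m and n, and the function names are hypothetical choices made only for illustration; the text does not specify them, and a real microcavity comb also produces lines on both sides of the pump, which this sketch ignores.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def comb_lines(pump_wavelength_nm, spacing_ghz, count):
    """Return `count` wavelengths (nm) whose frequencies are equally spaced."""
    f0 = C / (pump_wavelength_nm * 1e-9)  # pump frequency in Hz
    return [C / (f0 + k * spacing_ghz * 1e9) * 1e9 for k in range(count)]

def group_lines(lines, m, n):
    """Split m*n separated lasers into n groups of m lasers each."""
    if len(lines) != m * n:
        raise ValueError("expected exactly m*n comb lines")
    return [lines[g * m:(g + 1) * m] for g in range(n)]

m, n = 9, 4                               # assumed: 3 x 3 kernels, 4 kernels
lines = comb_lines(1550.0, 100.0, m * n)  # assumed pump and spacing
groups = group_lines(lines, m, n)         # one group per multiplexer
```

Each entry of `groups` then corresponds to the m wavelengths that one multiplexer would later recombine into a single composite signal.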

The information coding module comprises m × n electro-optical Mach-Zehnder modulators (MZIs). The control circuit of each MZI is connected with the control module; it generates an electrical signal according to the information to be encoded and applies it to the two arms of the MZI to adjust their refractive index, thereby encoding information onto the separated signal light passing through the two arms.

Specifically, when information is encoded with the electro-optical Mach-Zehnder modulators, the parameters used in each convolution operation are flattened in order into a one-dimensional sequence; when the convolution kernel is 3 × 3, the length m of the sequence is 9. Within one unit of time, this sequence modulates a group of separated lasers containing m optical paths to complete the information encoding; in the next unit of time, the window moves by one stride and the next sequence modulates the same group of m optical paths. A convolution layer usually comprises more than one convolution kernel; when the number of convolution kernels is n, the same operation is carried out on the other n-1 groups of separated lasers, each containing m optical paths, to complete the information encoding corresponding to the other n-1 convolution kernels.

After information encoding is complete, the m × n separated lasers are divided into n groups of m, where m is the number of weights required by the convolution operation, and each group is multiplexed by one of the n multiplexers to obtain the composite signal light.
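The time-stepped encoding described above can be sketched as follows. This is only an illustration under assumptions: read together with the later operating description, the one-dimensional sequence is taken to be the feature-image pixels under the kernel window, the same sequence is assumed to be copied to each of the n groups, and the function name `flatten_windows` is hypothetical.

```python
def flatten_windows(image, k, stride=1):
    """Flatten each k x k window of a 2-D list into a sequence of length k*k."""
    rows, cols = len(image), len(image[0])
    sequences = []
    for r in range(0, rows - k + 1, stride):
        for c in range(0, cols - k + 1, stride):
            sequences.append(
                [image[r + i][c + j] for i in range(k) for j in range(k)]
            )
    return sequences

# A 4 x 4 test image with pixel values 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]

# One sequence per unit of time; each sequence drives m = 9 modulators.
steps = flatten_windows(image, k=3)

# Assumed: every one of the n kernel groups receives the same sequence.
n = 2
encoded = [[seq[:] for _ in range(n)] for seq in steps]
```

With a 3 × 3 kernel on a 4 × 4 image and a stride of 1, the sketch yields four time steps, each carrying m = 9 values per group.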

The convolution module performs the convolution operation on the composite signal light, changing its optical information, and passes the result to the data acquisition and processing module. There the balanced photodetector converts the optical signal into an electrical signal, and the processing module amplifies the acquired current signal and carries out subsequent processing such as classification and target identification. The convolution module comprises n Add-Drop micro-ring resonators; the input port of each micro-ring resonator is connected with the output of a corresponding multiplexer, its through port and drop port are connected with the balanced photodetector, and its control circuit is connected with the control module.

Each micro-ring resonator comprises a plurality of resonant cavities arranged between two waveguides, and the number of resonant cavities equals the number of weights in each convolution kernel. When composite signal light containing m lasers of different wavelengths is input into the micro-ring resonator, each resonant cavity is tuned according to the current value supplied by the control circuit so as to modulate the laser at its wavelength. The modulation results of the m lasers are then summed and delivered through the through port and the drop port to the balanced photodetector, which produces the corresponding electrical signal from the difference between the optical signals at the through port and the drop port.
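A minimal numerical sketch of how a port difference can represent a signed weight is given below. It assumes an idealized lossless cavity in which the power at each wavelength splits between the drop and through ports, and it assumes the sign convention "drop minus through"; neither assumption is stated in the text, and the function names are hypothetical.

```python
def port_split(w):
    """Return (drop, through) power fractions that realize weight w in [-1, 1]."""
    if not -1.0 <= w <= 1.0:
        raise ValueError("weight must lie in [-1, 1]")
    drop = (1.0 + w) / 2.0   # fraction coupled to the drop port
    return drop, 1.0 - drop  # lossless: the remainder goes to the through port

def balanced_output(inputs, weights):
    """Sum both ports over all wavelengths and take drop minus through."""
    drop_sum = through_sum = 0.0
    for x, w in zip(inputs, weights):
        d, t = port_split(w)
        drop_sum += d * x
        through_sum += t * x
    return drop_sum - through_sum  # equals sum(w * x) under these assumptions

result = balanced_output([0.2, 0.5, 1.0], [-1.0, 0.5, 0.0])
```

Because drop minus through equals w for every wavelength, the detector difference is the weighted sum of the non-negative input powers, even though each individual port power is itself non-negative.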

This embodiment uses a continuous fiber laser, a high-Q microcavity and a demultiplexer together to generate multiple separated lasers of different wavelengths. This avoids equipping every optical path with its own light source generator and so reduces the number of light source generators. The embodiment also addresses two problems that arise when convolution parameters are mapped onto optical devices: 1. the intensity and power of light are non-negative, so negative convolution kernel parameters are difficult to encode; 2. without an additional energy supply, physical quantities such as light intensity and power can only decrease along the optical path and never increase, so convolution kernel parameters with an absolute value greater than 1 cannot be physically realized. To solve these, the convolution module is built from micro-ring resonators, the sign of a convolution kernel parameter is realized through the difference between the optical signals at the through port and the drop port, and the convolution parameters are normalized so that their values lie in the range [-1, 1]. This makes it convenient to map the convolution operation onto optical devices, reduces the number of optical devices used and the integration cost of the on-chip photonic neural network, improves the feasibility of realizing a convolutional neural network on chip, saves on-chip space and improves its utilization, and facilitates the construction of large-scale silicon photonic convolutional neural network chips.

The construction method of the on-chip photon convolution neural network specifically comprises the following steps:

step S1, obtaining the convolution kernel parameters of the original neural network to be mapped to the optical devices, wherein the convolution kernel parameters comprise the number n of convolution kernels, the number m of parameters in each convolution kernel, and the m weights corresponding to those parameters in each convolution kernel;

step S2, normalizing the convolution kernel parameters, linearly scaling them into [-1, 1] so that they can be physically realized;

the normalization formula is as follows:

step S3, optimizing the convolution kernel parameters by using a pruning algorithm to obtain a final convolution kernel matrix W, wherein the convolution kernel matrix W is a set formed by n convolution kernels;

the pruning algorithm may be a channel pruning algorithm, a convolution kernel pruning algorithm, convolution kernel width and height clipping, or a sparse matrix method; to ensure that the output of any of these pruning algorithms can be mapped onto the optical convolutional neural network chip, the output of pruning is a convolution kernel matrix. In this embodiment, the pruning algorithm removes some highly correlated convolution kernel parameters without affecting model accuracy, so that fewer optical devices are needed when the convolution kernel parameters are mapped to the optical neural network, the network is easier to integrate on chip, and the construction of a large-scale silicon photonic neural network chip is facilitated;

step S4, using micro-ring resonators to realize convolution operation weight coding in the original neural network, setting n micro-ring resonators according to the number of convolution kernels to form an original convolution module, wherein each micro-ring resonator comprises m resonant cavities;

pruning the structure of the original convolution module according to the convolution kernel matrix W: when some parameters of a convolution kernel in W are zero, the corresponding resonant cavities in the corresponding micro-ring resonator are removed, as shown in (b) of fig. 3; when all parameters of a convolution kernel are zero, the corresponding micro-ring resonator is removed, as shown in (a) of fig. 3; the result is the final convolution module;

step S5, determining the number of composite signal lights output by the light source module and the number of lasers contained in each group of composite signal lights according to the number of micro-ring resonators and resonant cavities in the convolution module determined in step S4, and further determining the number of multiplexers, MZIs and the number of balanced photodetectors;

and arranging a corresponding light source input module and a data acquisition and processing module, connecting the control module with the MZI and convolution module to complete the mapping from the neural network to the chip, and forming the on-chip photon convolution neural network.
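Steps S4 and S5 amount to a device count derived from the pruned matrix W, which can be sketched as below. The one-to-one mappings used here (one multiplexer and one balanced photodetector per remaining micro-ring resonator, one MZI per remaining resonant cavity) are assumptions inferred from the module descriptions, and the function name `count_devices` is hypothetical.

```python
def count_devices(W):
    """Count optical devices for a pruned kernel matrix W (n rows of m weights)."""
    # Keep only non-zero weights; each one needs a resonant cavity.
    kept = [[w for w in kernel if w != 0] for kernel in W]
    # A kernel whose weights are all zero loses its whole micro-ring resonator.
    kept = [kernel for kernel in kept if kernel]
    resonators = len(kept)
    cavities = sum(len(kernel) for kernel in kept)
    return {
        "micro_ring_resonators": resonators,
        "resonant_cavities": cavities,
        "multiplexers": resonators,             # assumed: one per resonator
        "mzis": cavities,                       # assumed: one per cavity
        "balanced_photodetectors": resonators,  # assumed: one per resonator
        "lasers_per_group": [len(kernel) for kernel in kept],
    }

W = [
    [0.5, 0.0, -0.2, 0.0],  # partially pruned kernel: two cavities remain
    [0.0, 0.0, 0.0, 0.0],   # fully pruned kernel: resonator removed
    [1.0, -1.0, 0.3, 0.1],  # unpruned kernel: four cavities remain
]
devices = count_devices(W)
```

In this example the unpruned design would need 3 resonators and 12 cavities, while the pruned design needs 2 resonators and 6 cavities.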

Because the convolution kernel parameters are linearly scaled down in step S2, a corresponding amplification algorithm and an activation function are set in the data acquisition and processing module. The acquired electrical signal is amplified in equal proportion so that the calculation result is unchanged, and the activation function can be chosen flexibly according to the task handled by the convolutional neural network model. The amplification algorithm is as follows:

y' represents the output of the normalized data after convolution kernel calculation, x represents the input of the convolution module, namely the composite signal light carrying the characteristic image pixel information, and y is the restored (amplified) signal value.
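The normalization and amplification formulas themselves are not reproduced in this text, so the following is only a sketch under an assumed form. It uses the min-max affine map w' = (2w - w_max - w_min) / (w_max - w_min), which is one linear map onto [-1, 1] consistent with the symbols defined (w_min, w_max, w, w', x, y, y'); the actual formulas may differ, and the function names are hypothetical.

```python
def normalize(weights):
    """Map weights linearly onto [-1, 1] (assumed min-max form)."""
    w_min, w_max = min(weights), max(weights)
    if w_max == w_min:
        raise ValueError("all weights are equal; the affine map is undefined")
    scale = (w_max - w_min) / 2.0
    offset = (w_max + w_min) / 2.0
    return [(w - offset) / scale for w in weights], scale, offset

def restore(y_norm, inputs, scale, offset):
    """Recover y = sum(w * x) from y' = sum(w' * x) and the inputs x."""
    return scale * y_norm + offset * sum(inputs)

weights = [2.0, -3.0, 0.5, 4.0]  # original kernel parameters w
inputs = [0.1, 0.4, 0.7, 0.2]    # convolution module inputs x

w_norm, scale, offset = normalize(weights)
y_norm = sum(w * x for w, x in zip(w_norm, inputs))      # optical result y'
y = restore(y_norm, inputs, scale, offset)               # amplified value y
y_direct = sum(w * x for w, x in zip(weights, inputs))   # reference value
```

Under this assumed form the restored value depends on the inputs x as well as on y'. When w_min = -w_max the offset vanishes and restoration reduces to a purely proportional amplification.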

The process of the channel pruning algorithm is described assuming that the original neural network is used to process the image:

Let the input of the original neural network be X and the output be Y. X has a scale of [N·n_i·H_k×W_k], where N represents the total number of feature maps, n_i represents the initial number of input channels, H_k represents the height of the convolution kernel, and W_k represents the width of the convolution kernel; Y has a scale of [N·n_i]. Then the problem solved is:

subject to ‖β‖₀ ≤ c′

where i is the channel index; β_i is the component of the vector β that indicates whether the i-th channel is to be pruned, and takes the value 0 or 1 (0 means the channel is pruned, 1 means the channel is kept); β is the vector formed by the β_i; W denotes the convolution kernel matrix; X_i represents the input of the i-th channel, with a scale of [N·H_k×W_k]; W_i represents the weights sliced from the i-th channel, with a scale of [n_{i+1}·H_k×W_k]; n_{i+1} indicates the number of output channels; and c′ indicates the number of input channels retained after pruning.

When solving, the ℓ₀ minimization problem is first relaxed to an ℓ₁ minimization problem, namely:

subject to ‖β‖₀ ≤ c′,

It is then solved using LASSO regression:

until ‖β‖₀ ≤ c′ is satisfied,

where λ represents the parameter of the LASSO regression; the larger λ is, the heavier the penalty on linear models with more variables.

Then

is solved, where W′ is the two-dimensional matrix obtained by reshaping the convolution kernel matrix W, with a size of [n_{i+1}·n_i·H_k×W_k]. After the solution is complete, W is recovered from W′, and the channels (micro-ring resonators and resonant cavities) are then pruned according to the convolution kernel matrix W.
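The objective functions of this procedure are not reproduced in this text. The symbol definitions match the widely used LASSO-based channel pruning formulation, so, as an assumption rather than a statement of the original formulas, the problems may be written as follows. Here c = n_i is taken as the number of input channels, Y is taken to have n_{i+1} columns so that the dimensions agree, Z_i and X′ are auxiliary symbols introduced only for this sketch, and the unit-norm constraint on W_i is part of the assumed formulation.

```latex
% Assumed l0-constrained reconstruction problem
\min_{\beta,\,W}\ \frac{1}{2N}\Bigl\| Y - \sum_{i=1}^{c} \beta_i X_i W_i^{\top} \Bigr\|_F^2
\quad \text{subject to} \quad \|\beta\|_0 \le c'

% Assumed l1 relaxation
\min_{\beta,\,W}\ \frac{1}{2N}\Bigl\| Y - \sum_{i=1}^{c} \beta_i X_i W_i^{\top} \Bigr\|_F^2
+ \lambda \|\beta\|_1
\quad \text{subject to} \quad \|\beta\|_0 \le c',\quad \forall i\ \|W_i\|_F = 1

% Assumed LASSO step with W fixed, where Z_i = X_i W_i^{\top};
% lambda is increased until \|\beta\|_0 \le c'
\hat{\beta}(\lambda) = \arg\min_{\beta}\ \frac{1}{2N}\Bigl\| Y - \sum_{i=1}^{c} \beta_i Z_i \Bigr\|_F^2
+ \lambda \|\beta\|_1

% Assumed reconstruction step with beta fixed, where
% X' = [\beta_1 X_1\ \ \beta_2 X_2\ \cdots\ \beta_c X_c]
\min_{W'}\ \bigl\| Y - X' (W')^{\top} \bigr\|_F^2
```

Under these assumptions each product X_i W_i^T has N rows and n_{i+1} columns, and W′ has n_{i+1} rows and n_i·H_k×W_k columns, in line with the size stated for W′.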

When the on-chip photonic neural network is used for computation, the modulation parameters of the micro-ring resonators are calculated from the convolutional neural network produced by the pruning algorithm, and the modulation parameters of the MZIs are calculated from the pixel information of the feature image to be processed. The photon input module generates m × n separated lasers with different wavelengths, and the MZIs modulate the lasers so that each carries the information of one pixel of the feature image. The lasers are input into the multiplexers to obtain several beams of composite signal light, and the micro-ring resonators, controlled by the convolution kernel parameters, apply the weighting operation to each beam of composite signal light to obtain the weighted result for the pixel information of the feature image. Subsequent processing such as classification is then performed on the feature image based on the weighted results. The whole process is fast and efficient.

All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on differences from other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.

The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. An on-chip photonic convolutional neural network, comprising:

wherein the light source module includes:

the photon input module is used for generating m multiplied by n groups of separated laser with different wavelengths;

n multiplexers, which are used for multiplexing the separation laser after the m groups of information are coded to obtain n groups of composite signal lights;

2. The on-chip photonic convolutional neural network of claim 1, wherein the photonic input module comprises:

a continuous fiber laser for generating continuous laser light;

3. The on-chip photonic convolutional neural network of claim 1, wherein the information encoding module is composed of m × n electro-optical mach-zehnder modulators, a control circuit of each electro-optical mach-zehnder modulator is connected to the control module, an input end of each electro-optical mach-zehnder modulator is connected to the demultiplexer, and an output end of each electro-optical mach-zehnder modulator is connected to the multiplexer.

4. The on-chip photonic convolutional neural network of claim 1, wherein the convolutional module comprises n micro-ring resonators, a control circuit of each micro-ring resonator is connected to the control module, an input port is connected to an output of each multiplexer, and a through port and a drop port are connected to the data acquisition and processing module.

5. The on-chip photonic convolutional neural network of claim 4, wherein the data acquisition and processing module comprises:

6. The method for constructing the on-chip photonic convolutional neural network as claimed in any one of claims 1 to 5, comprising the steps of:

7. The method of constructing an on-chip photonic convolutional neural network of claim 6, wherein the normalization operation is as follows:

8. The method for constructing the on-chip photonic convolutional neural network of claim 6, wherein the pruning algorithm adopts a channel pruning algorithm, a convolutional kernel pruning algorithm, convolutional kernel width and height clipping or a sparse matrix method.