CN116306849A - Training of reverse neural network model and determining method and device of optical processor
- Publication number
- CN116306849A (CN202310283096.6A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- optical processor
- optical
- determining
- network model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The embodiment of the invention discloses a method and a device for training a reverse neural network model and for determining an optical processor. The training method of the reverse neural network model comprises the following steps: determining a plurality of optical processors and acquiring a QR code matrix matched with each optical processor, wherein the optical processor comprises silicon on insulator, a transmission waveguide, and a nanostructure, and the QR code matrix is matched with the nanostructure of the optical processor; generating a transfer matrix corresponding to each optical processor according to each QR code matrix; determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model, wherein the reverse neural network model is used to determine the nanostructure of the optical processor that matches a target transfer matrix. According to the scheme provided by the embodiment of the invention, a reverse neural network model can be trained, which provides a basis for rapidly determining optical processors with different functions.
Description
Technical Field
The embodiment of the invention relates to the technical field of neural networks, in particular to a method and a device for training a reverse neural network model and determining an optical processor.
Background
At present, integrated optical neural networks based on silicon-on-insulator (SOI) platforms have received attention in the industry because of their fast computation speed, low energy consumption, and good parallel computing capability. An optical processor mainly uses a silicon photonic chip to perform fast and efficient matrix multiplication with low energy consumption.
The conventional 2×2 optical processor is a Mach-Zehnder interferometer (MZI) consisting of a directional coupler, a multimode interferometer, and a phase shifter. An MZI is usually larger than 100 μm, which is disadvantageous for high-density integration. To achieve high compactness, the industry has proposed implementing free-form devices with a large number of degrees of freedom (DOF) using topology optimization. However, compact free-form devices are difficult to manufacture; therefore, quick response (QR) code-like devices, which are both highly compact and have a controllable minimum feature size, have received great attention.
Fig. 1 is a schematic diagram of a 2×2 optical processor based on an SOI platform according to an embodiment of the present invention. As shown in Fig. 1, the intermediate transmission waveguide is divided into small blocks, and each block can be in one of two states: etched or non-etched. The intermediate nanostructure pattern thus resembles a QR code, and different nanostructure patterns yield different transfer matrices of the processor. The design of such devices typically relies on iterative optimization algorithms such as the genetic algorithm (GA), particle swarm optimization (PSO), and direct binary search (DBS). However, designing a single device with a target optical response typically requires hundreds of electromagnetic simulations, so this iterative approach becomes very laborious when multiple 2×2 optical processors with different functions are required.
With the rapid development of neural networks, those skilled in the art have considered whether a reverse neural network model can be trained to quickly determine optical processors with different functions.
Disclosure of Invention
The embodiment of the invention provides a method and a device for training a reverse neural network model and determining an optical processor, which are used for training the reverse neural network model and providing basis for rapidly determining the optical processors with different functions.
According to an aspect of the embodiment of the present invention, there is provided a training method of a reverse neural network model, including:
determining a plurality of optical processors, and acquiring a QR code matrix matched with each optical processor; wherein the optical processor comprises: silicon on insulator, transmission waveguide, and nanostructure; the QR code matrix is matched with the nanostructure of the optical processor;
generating a transfer matrix corresponding to each optical processor according to each QR code matrix;
determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model;
wherein the inverse neural network model is used to determine the nanostructure of the optical processor that matches the target transfer matrix.
According to another aspect of the embodiment of the present invention, there is provided a method for determining an optical processor, including:
determining a target transfer matrix corresponding to an optical processor to be determined, inputting the target transfer matrix into a reverse neural network model obtained by training in advance, and outputting a target QR code matrix corresponding to the optical processor to be determined;
determining the nanostructure of the optical processor to be determined according to the target QR code matrix;
the reverse neural network model is obtained by training the training method of the reverse neural network model according to any one of the embodiments of the invention.
According to another aspect of the embodiment of the present invention, there is provided a training apparatus for a reverse neural network model, including:
the optical processor determining module is used for determining a plurality of optical processors and acquiring a QR code matrix matched with each optical processor; wherein the optical processor comprises: silicon on insulator, transmission waveguide, and nanostructure; the QR code matrix is matched with the nanostructure of the optical processor;
the transfer matrix determining module is used for generating a transfer matrix corresponding to each optical processor according to each QR code matrix;
the reverse neural network model determining module is used for determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model;
wherein the inverse neural network model is used to determine the nanostructure of the optical processor that matches the target transfer matrix.
According to another aspect of the embodiment of the present invention, there is provided a determining apparatus of an optical processor, including:
the target transfer matrix determining module is used for determining a target transfer matrix corresponding to the optical processor to be determined, inputting the target transfer matrix into a reverse neural network model obtained through training in advance, and outputting a target QR code matrix corresponding to the optical processor to be determined;
the nanostructure determining module is used for determining the nanostructure of the optical processor to be determined according to the target QR code matrix;
the reverse neural network model is obtained by training the training method of the reverse neural network model according to any one of the embodiments of the invention.
According to another aspect of an embodiment of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method for training the inverse neural network model or the method for determining the optical processor according to any of the embodiments of the present invention.
According to another aspect of the embodiments of the present invention, there is provided a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the training method of the inverse neural network model or the determining method of the optical processor according to any of the embodiments of the present invention.
According to the technical scheme, a plurality of optical processors are determined, and a QR code matrix matched with each optical processor is obtained, wherein the optical processor comprises silicon on insulator, a transmission waveguide, and a nanostructure, and the QR code matrix is matched with the nanostructure of the optical processor; a transfer matrix corresponding to each optical processor is generated according to each QR code matrix; a training data set is determined according to each transfer matrix and each QR code matrix, and the training data set is input into a reverse neural network for iterative training to obtain a reverse neural network model; the reverse neural network model is used for determining the nanostructure of the optical processor matched with a target transfer matrix. In this way, a reverse neural network model can be trained, which provides a basis for rapidly determining optical processors with different functions.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention, nor is it intended to be used to limit the scope of the embodiments of the invention. Other features of embodiments of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a 2×2 optical processor based on an SOI platform according to an embodiment of the present invention;
Fig. 2 is a flowchart of a training method of a reverse neural network model according to a first embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a reverse neural network according to a first embodiment of the present invention;
Fig. 4 is a schematic diagram of a Gaussian mixture distribution model according to a first embodiment of the present invention;
Fig. 5 is a schematic diagram of a mixed density layer structure according to a first embodiment of the present invention;
Fig. 6 is a flowchart of a method for determining an optical processor according to a second embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a training device for a reverse neural network model according to a third embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a determining device of an optical processor according to a fourth embodiment of the present invention;
Fig. 9 is a schematic structural diagram of an electronic device implementing the training method of the inverse neural network model or the determining method of the optical processor according to an embodiment of the present invention.
Detailed Description
In order to make the embodiments of the present invention better understood by those skilled in the art, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the embodiments of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the embodiments of the present invention and the above-described drawings are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 2 is a flowchart of a training method of a reverse neural network model according to an embodiment of the present invention, where the embodiment is applicable to training to obtain a reverse neural network model, the method may be performed by a training device of the reverse neural network model, the training device of the reverse neural network model may be implemented in a hardware and/or software form, and the training device of the reverse neural network model may be configured in an electronic device such as a computer, a server or a tablet computer. Specifically, referring to fig. 2, the method specifically includes the following steps:
Step 210, determining a plurality of optical processors, and acquiring a QR code matrix matched with each optical processor.
Wherein the optical processor comprises: silicon on insulator, a transmission waveguide, and a nanostructure; the QR code matrix is matched with the nanostructure of the optical processor. It can be understood that, in the geometry of the optical processor, the middle transmission waveguide is divided into small blocks, and each block can be in one of two states, etched or non-etched, so that the middle nanostructure pattern resembles a QR code. In this embodiment the QR code may be represented by a matrix, i.e., the nanostructure of the optical processor is represented by a QR code matrix.
The dimension of the optical processor may be 2×2 or 3×3, which is not limited in this embodiment.
In an optional implementation manner of this embodiment, determining a plurality of optical processors and acquiring a QR code matrix matched with each of the optical processors may include: determining a plurality of optical processors, and respectively extracting the nanostructure of each optical processor; and obtaining a QR code matrix matched with each optical processor according to the nanostructure of each optical processor.
Alternatively, in this embodiment, the number of optical processors may be randomly determined, for example, 5, 10, or 50, and the QR codes and the transfer matrices of the optical processors differ from one another. While the plurality of optical processors is determined, the nanostructure of each optical processor may be extracted, and the QR code matrix matched with each optical processor may be obtained from its nanostructure.
Alternatively, in this embodiment, a plurality of QR code nanopatterns may be randomly generated, with each nanostructure represented by a matrix X. In this embodiment, X can be assumed to be an 18×18 matrix, i.e., $X = (x_{jk})$ with $j, k = 1, \dots, 18$.
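The random generation of QR code nanopatterns described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `random_qr_matrices`, the encoding (1 for an etched block, 0 for a non-etched block), and the rule of rejecting duplicate patterns are all assumptions.

```python
import numpy as np


def random_qr_matrices(count, size=18, seed=0):
    """Return `count` distinct random binary matrices of shape (size, size).

    Assumed encoding: 1 = etched block, 0 = non-etched block.
    """
    rng = np.random.default_rng(seed)
    seen = set()
    patterns = []
    while len(patterns) < count:
        x = rng.integers(0, 2, size=(size, size), dtype=np.uint8)
        key = x.tobytes()
        if key not in seen:  # keep every pattern unique
            seen.add(key)
            patterns.append(x)
    return np.stack(patterns)


patterns = random_qr_matrices(count=5)
print(patterns.shape)  # (5, 18, 18)
```

Each matrix would then be passed to the simulation step to obtain its transfer matrix.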
Step 220, generating a transfer matrix corresponding to each optical processor according to each QR code matrix.
In an optional implementation manner of this embodiment, after the plurality of optical processors is determined, the transfer matrix corresponding to each optical processor may be obtained by simulation from that processor's QR code matrix. The transfer matrix T is a 2×2 matrix in which each element includes a real part and an imaginary part, so T contains 8 parameters in total.
It should be noted that the dimension of a transfer matrix matches the dimension of its optical processor; for example, if the dimension of the optical processor is 2×2, the dimension of the corresponding transfer matrix is also 2×2; if the dimension of the optical processor is 3×3, the dimension of the corresponding transfer matrix is 3×3. That is, in this embodiment, the dimension of each transfer matrix matches the dimension of each optical processor, and each element of each transfer matrix includes a real part and an imaginary part.
In an optional implementation manner of this embodiment, generating, according to each QR code matrix, a transfer matrix corresponding to each optical processor includes: obtaining, by simulation with preset software, the transfer matrix corresponding to each optical processor according to each QR code matrix.
The preset software may be Lumerical software or other simulation software, which is not limited in this embodiment.
In this embodiment, the transfer matrix corresponding to each optical processor may be obtained by automated simulation using Lumerical software.
Step 230, determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model.
Wherein the inverse neural network model is used to determine the nanostructure of the optical processor that matches the target transfer matrix.
In an optional implementation manner of this embodiment, after the transfer matrix and the QR code matrix corresponding to each optical processor are obtained, the training data set of the inverse neural network model may be determined according to the transfer matrices and the QR code matrices.
Optionally, in this embodiment, determining the training data set according to each of the transfer matrices and each of the QR code matrices may include: acquiring each element in each transfer matrix, and sequencing each element according to a set sequence; generating each target column vector according to each sequencing result; and respectively combining each target column vector with each QR code matrix to obtain the training data set.
In this embodiment, the transfer matrix T is a 2×2 matrix in which each element includes a real part and an imaginary part, giving 8 parameters in total. The 8 parameters can be taken sequentially in the set order, from small to large, to convert the transfer matrix into a 1×8 vector Y, and (Y, X) is then used as one training sample, where X is the QR code matrix. A plurality of (Y, X) pairs forms the data set referred to in this embodiment.
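As a rough illustration of how a complex 2×2 transfer matrix becomes a 1×8 vector, the sketch below flattens T row by row and writes the real part followed by the imaginary part of each element. That ordering is an assumption, since the text only says the elements are arranged in a set order; the function name `transfer_matrix_to_vector` and the numeric values of `t` are hypothetical and are not simulation results.

```python
import numpy as np


def transfer_matrix_to_vector(t):
    """Flatten a complex transfer matrix into a real vector.

    Assumed order: elements row by row, real part then imaginary part of each.
    """
    t = np.asarray(t, dtype=np.complex128)
    flat = t.reshape(-1)
    return np.stack([flat.real, flat.imag], axis=1).reshape(-1)


# Illustrative values only.
t = np.array([[0.6 + 0.1j, 0.2 - 0.3j],
              [0.2 + 0.3j, 0.6 - 0.1j]])
y = transfer_matrix_to_vector(t)          # 8 real parameters
x = np.zeros((18, 18), dtype=np.uint8)    # placeholder QR code matrix
sample = (y, x)                           # one (Y, X) training pair
print(y.shape)  # (8,)
```

The same function handles a 3×3 transfer matrix, which would give 18 parameters.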
The inverse neural network model in this embodiment implements the inverse design task for the 2×2 optical processor: the target optical response is input into the network, and the corresponding nanostructure of the optical processor, i.e., the QR code, is obtained. 90% of the data set is used as the training set and 10% as the validation set. A predicted X' is obtained by inputting Y, the optical response corresponding to X' is simulated using Lumerical software, and this response is compared with the actual optical response Y in order to train the inverse neural network.
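The 90%/10% division of the data set can be sketched as below. Shuffling before the split, the fixed seed, and the name `split_dataset` are assumptions made for illustration; the zero-filled arrays stand in for real (Y, X) data.

```python
import numpy as np


def split_dataset(ys, xs, train_fraction=0.9, seed=0):
    """Shuffle the (Y, X) pairs and split them into training and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(ys))
    cut = int(round(train_fraction * len(ys)))
    train_idx, val_idx = order[:cut], order[cut:]
    return (ys[train_idx], xs[train_idx]), (ys[val_idx], xs[val_idx])


ys = np.zeros((50, 8))                       # placeholder transfer-matrix vectors
xs = np.zeros((50, 18, 18), dtype=np.uint8)  # placeholder QR code matrices
(train_y, train_x), (val_y, val_x) = split_dataset(ys, xs)
print(len(train_y), len(val_y))  # 45 5
```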
Wherein, the reverse neural network may include: at least two fully connected layers, at least two convolution layers, and a mixed density layer. Fig. 3 is a schematic structural diagram of a reverse neural network according to a first embodiment of the present invention; as shown in fig. 3, the inverse neural network includes three full-connected layers, three convolution layers, and one mixed density layer.
It should be noted that different QR codes may produce the same optical response, i.e., one Y may correspond to several different X, which can make the network difficult to converge or degrade the training result. The present embodiment therefore incorporates a mixed density layer in the network structure. The output of the mixed density layer is a Gaussian mixture distribution, which can handle this multi-valued mapping. Fig. 4 is a schematic diagram of a Gaussian mixture distribution model according to the first embodiment of the present invention. A deep convolutional network is used to obtain a predicted QR code X'. Assuming X' is an 18×18 matrix, it can be flattened into a 1×324 matrix, $X' = [x_1, x_2, \dots, x_{324}]^T$, and input to the mixed density layer; X' can then be regarded as a 324-dimensional vector. In essence, the mixed density layer predicts the Gaussian mixture distribution of X'.
In this embodiment, the input of the mixed density layer is denoted by x and the output by t. The input is an n-dimensional vector (n = 324 if the QR code is an 18×18 matrix), and the probability density of the target value can be expressed as a linear combination of multiple kernel functions:
$$p(t \mid x) = \sum_{i=1}^{m} \alpha_i(x)\, \phi_i(t \mid x)$$
where $\alpha_i(x)$ is called the mixing coefficient and can be regarded as a prior probability conditioned on x, and $\phi_i$ denotes the i-th kernel of the target vector t. Here, the kernel function is a Gaussian distribution function, m is the number of kernels in the Gaussian mixture distribution, and the kernel function $\phi_i$ is expressed as:
$$\phi_i(t \mid x) = \frac{1}{(2\pi)^{c/2}\, \sigma_i(x)^{c}} \exp\left( -\frac{\lVert t - \mu_i(x) \rVert^{2}}{2\, \sigma_i(x)^{2}} \right)$$
where c is the dimension of t. Since the input x is n-dimensional, the output t is also n-dimensional, i.e., c = n. Each kernel function is a multivariate Gaussian distribution, $\sigma_i(x)$ is a scalar, and $\mu_i(x)$ is a vector of the same dimension as the target value t. As shown in Fig. 3, the mixed density layer directly follows the deep convolutional neural network and outputs a Gaussian mixture model. Fig. 5 is a schematic diagram of a mixed density layer structure according to the first embodiment of the present invention, in which the front part is still a neural network structure and the back part is a Gaussian mixture model. According to the formula for p(t|x), the parameters to be optimized are $\alpha_i(x)$, $\sigma_i(x)$, and $\mu_i(x)$. One p(t|x) has m mixing coefficients $\alpha_i(x)$, m scalars $\sigma_i(x)$, and m vectors $\mu_i(x)$ that together contain nm scalars, so the neural network should have m(n+2) output variables. In a Gaussian mixture distribution, the sum of all mixing coefficients is 1:
$$\sum_{i=1}^{m} \alpha_i(x) = 1$$
This constraint can be satisfied with a softmax function:
$$\alpha_i = \frac{\exp(z_i^{\alpha})}{\sum_{j=1}^{m} \exp(z_j^{\alpha})}$$
where $z_i^{\alpha}$ corresponds to one output variable of the neural network. The variance and mean of each Gaussian unit can be expressed correspondingly as:
$$\sigma_i = \exp(z_i^{\sigma}), \qquad \mu_{ik} = z_{ik}^{\mu}$$
where $\mu_{ik}$ refers to the k-th scalar of the mean $\mu_i$ of the i-th Gaussian distribution, since $\mu_i$ is an n-dimensional vector.
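To make the count of m(n+2) output variables concrete, the sketch below splits a raw output vector z into the mixing coefficients, the scale parameters, and the means. It assumes a softmax over the α outputs (the usual choice in mixture density networks) and an exponential for σ; the layout of z (α first, then σ, then μ) and the name `mdn_parameters` are hypothetical.

```python
import numpy as np


def mdn_parameters(z, m, n):
    """Split raw outputs z of length m*(n+2) into mixture parameters.

    Assumed layout of z: m values for alpha, m for sigma, then m*n for mu.
    """
    z = np.asarray(z, dtype=np.float64)
    assert z.shape == (m * (n + 2),)
    z_alpha, z_sigma, z_mu = z[:m], z[m:2 * m], z[2 * m:]
    e = np.exp(z_alpha - z_alpha.max())  # shifted for numerical stability
    alpha = e / e.sum()                  # softmax: coefficients sum to 1
    sigma = np.exp(z_sigma)              # exponential keeps sigma positive
    mu = z_mu.reshape(m, n)              # one n-dimensional mean per kernel
    return alpha, sigma, mu


m, n = 3, 324  # 3 kernels, 18 x 18 = 324 target dimensions
rng = np.random.default_rng(0)
alpha, sigma, mu = mdn_parameters(rng.normal(size=m * (n + 2)), m, n)
print(sigma.shape, mu.shape)  # (3,) (3, 324)
```

With m = 3 and n = 324 the layer needs 3 × 326 = 978 raw outputs.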
The loss function of the inverse neural network seeks the parameters that maximize the probability p(t|x) for a given x, i.e., arg max p(t|x), which is usually written as an error function to be minimized:
$$E = \sum_{q} E^{q}, \qquad E^{q} = -\ln\left\{ \sum_{i=1}^{m} \alpha_i(x^{q})\, \phi_i(t^{q} \mid x^{q}) \right\}$$
where $E^{q}$ represents the loss of each sample. Finally, the output Gaussian mixture distribution is obtained by optimizing the parameter values so that the value of the error function is minimized.
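A per-sample loss of this kind can be sketched as the negative log-likelihood of the mixture, assuming isotropic Gaussian kernels with a scalar σ per kernel, as the text describes. The function name is hypothetical, and the log-sum-exp shift is an implementation choice added here for numerical stability with 324-dimensional targets.

```python
import numpy as np


def mdn_negative_log_likelihood(alpha, sigma, mu, t):
    """Per-sample loss -ln(sum_i alpha_i * phi_i(t)) for isotropic Gaussian kernels."""
    t = np.asarray(t, dtype=np.float64)
    c = t.shape[0]                                   # dimension of the target
    sq_dist = np.sum((mu - t) ** 2, axis=1)          # ||t - mu_i||^2, shape (m,)
    log_phi = (-0.5 * c * np.log(2.0 * np.pi)
               - c * np.log(sigma)
               - sq_dist / (2.0 * sigma ** 2))
    log_terms = np.log(alpha) + log_phi
    peak = log_terms.max()                           # log-sum-exp shift
    return -(peak + np.log(np.sum(np.exp(log_terms - peak))))


# Two identical unit-variance kernels centred on the target in 2 dimensions.
alpha = np.array([0.5, 0.5])
sigma = np.array([1.0, 1.0])
mu = np.zeros((2, 2))
loss = mdn_negative_log_likelihood(alpha, sigma, mu, t=[0.0, 0.0])
print(loss)  # equals ln(2*pi), about 1.8379
```

Summing this value over all training samples gives the total error that training minimizes.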
According to the technical scheme, a plurality of optical processors are determined, and a QR code matrix matched with each optical processor is obtained, wherein the optical processor comprises silicon on insulator, a transmission waveguide, and a nanostructure, and the QR code matrix is matched with the nanostructure of the optical processor; a transfer matrix corresponding to each optical processor is generated according to each QR code matrix; a training data set is determined according to each transfer matrix and each QR code matrix, and the training data set is input into a reverse neural network for iterative training to obtain a reverse neural network model; the reverse neural network model is used for determining the nanostructure of the optical processor matched with a target transfer matrix. In this way, a reverse neural network model can be trained, which provides a basis for rapidly determining optical processors with different functions.
Example two
FIG. 6 is a flow chart of a method for determining an optical processor according to a second embodiment of the present invention; the present embodiment may be applied to a case where the above embodiment is trained to obtain a reverse neural network model to determine an optical processor, where the method may be performed by a determining device of the optical processor, where the determining device of the optical processor may be implemented in hardware and/or software, and the determining device of the optical processor may be configured in an electronic device such as a computer, a server, or a tablet computer. Specifically, referring to fig. 6, the method specifically includes the steps of:
In an optional implementation manner of this embodiment, a target transfer matrix corresponding to the optical processor to be determined may be determined according to a design requirement or a functional requirement. The target transfer matrix is input into the reverse neural network model obtained through training in the foregoing embodiment and is processed by each network layer in the model, so as to output a target QR code matrix corresponding to the optical processor to be determined.
In an optional implementation manner of this embodiment, after the QR code matrix corresponding to the optical processor to be determined is obtained through the inverse neural network model, the QR code matrix may be converted, so as to obtain the nanostructure of the optical processor to be determined.
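One way to picture the conversion from a QR code matrix to a nanostructure is to list the positions of the etched blocks. The sketch below is only an assumption about what such a conversion could look like: the encoding (1 = etched), the block pitch of 0.1 μm, and the function name `etched_block_centers` are hypothetical and do not come from the patent.

```python
import numpy as np


def etched_block_centers(qr, pitch_um):
    """Return the (x, y) centres, in micrometres, of all blocks marked as etched.

    Assumed encoding: 1 = etched block; `pitch_um` is a hypothetical block size.
    """
    rows, cols = np.nonzero(np.asarray(qr) == 1)
    x = (cols + 0.5) * pitch_um
    y = (rows + 0.5) * pitch_um
    return np.stack([x, y], axis=1)


qr = np.zeros((18, 18), dtype=np.uint8)
qr[0, 0] = 1
qr[2, 5] = 1
centers = etched_block_centers(qr, pitch_um=0.1)
print(centers.shape)  # (2, 2)
```

A layout or simulation tool could consume such a list of centres to place the etched regions on the waveguide.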
In an optional implementation manner of this embodiment, after the target transfer matrix is input into the inverse neural network model, it may first be processed by the fully connected layers to obtain a processing result; the processing result is then processed sequentially by the convolution layers to obtain a convolution layer processing result, which is input into the mixed density layer to obtain the QR code matrix corresponding to the target transfer matrix.
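The text does not spell out how a binary QR code matrix is read from the Gaussian mixture produced by the mixed density layer. One plausible decoding, shown purely as an assumption, takes the mean of the kernel with the largest mixing coefficient and thresholds it at 0.5; the function name and the threshold are hypothetical.

```python
import numpy as np


def decode_qr_matrix(alpha, mu, size=18, threshold=0.5):
    """Pick the most probable kernel and binarise its mean into a QR code matrix.

    This decoding rule is an illustrative assumption, not the patent's method.
    """
    best = int(np.argmax(alpha))
    return (mu[best] >= threshold).astype(np.uint8).reshape(size, size)


alpha = np.array([0.2, 0.7, 0.1])  # illustrative mixing coefficients
mu = np.zeros((3, 324))            # illustrative kernel means
mu[1, :18] = 0.9                   # second kernel marks the first row as etched
qr = decode_qr_matrix(alpha, mu)
print(qr.sum())  # 18
```

Sampling from the mixture instead of taking the dominant mean would return several candidate designs for the same target transfer matrix.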
According to the scheme of this embodiment, the target transfer matrix corresponding to the optical processor to be determined is determined, the target transfer matrix is input into the inverse neural network model obtained through training in advance, and the target QR code matrix corresponding to the optical processor to be determined is output; the nanostructure of the optical processor to be determined is then determined according to the target QR code matrix. In this way, the optical processor can be determined quickly and accurately, saving a large amount of calculation time.
Example III
Fig. 7 is a schematic structural diagram of a training device for an inverse neural network model according to a third embodiment of the present invention. As shown in fig. 7, the apparatus includes: an optical processor determining module 710, a transfer matrix determining module 720, and an inverse neural network model determining module 730.
An optical processor determining module 710, configured to determine a plurality of optical processors, and obtain a QR code matrix matched with each of the optical processors; wherein the optical processor comprises: silicon on insulator, transmission waveguide, and nanostructure; the QR code matrix is matched with the nanostructure of the optical processor;
a transfer matrix determining module 720, configured to generate a transfer matrix corresponding to each of the optical processors according to each of the QR code matrices;
the inverse neural network model determining module 730 is configured to determine a training data set according to each transfer matrix and each QR code matrix, and input the training data set into an inverse neural network for iterative training, so as to obtain an inverse neural network model.
According to the scheme of this embodiment, the optical processor determining module determines a plurality of optical processors and obtains a QR code matrix matched with each optical processor, wherein the optical processor comprises silicon on insulator, a transmission waveguide, and a nanostructure, and the QR code matrix is matched with the nanostructure of the optical processor; the transfer matrix determining module generates a transfer matrix corresponding to each optical processor according to each QR code matrix; and the inverse neural network model determining module determines a training data set according to the transfer matrices and the QR code matrices and inputs the training data set into an inverse neural network for iterative training to obtain an inverse neural network model. This provides a basis for rapidly determining optical processors with different functions.
In an optional implementation manner of this embodiment, the optical processor determining module 710 is specifically configured to determine a plurality of the optical processors, and extract the nanostructures of each optical processor separately; and obtaining a QR code matrix matched with each optical processor according to the nanostructure of each optical processor.
In an optional implementation of this embodiment, the transfer matrix determining module 720 is specifically configured to obtain, through simulation with preset software, a transfer matrix corresponding to each of the optical processors according to each QR code matrix;
the dimensions of each transfer matrix match the dimensions of the corresponding optical processor, and each element of each transfer matrix comprises a real part and an imaginary part.
In an optional implementation of this embodiment, the inverse neural network model determining module 730 is specifically configured to obtain each element in each transfer matrix, and order the elements according to a set order;
generating each target column vector according to each sequencing result;
and respectively combining each target column vector with each QR code matrix to obtain the training data set.
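The construction of the training data set described above can be sketched as follows. The source only says that the elements are ordered "according to a set order"; the row-by-row order with the real part before the imaginary part used here is an assumption, as are the function names and the sample values.

```python
from typing import List, Tuple

ComplexMatrix = List[List[complex]]
QRMatrix = List[List[int]]


def transfer_matrix_to_column(t: ComplexMatrix) -> List[float]:
    """Flatten a complex transfer matrix into a real-valued column vector.

    Assumed order: row by row, real part then imaginary part per element.
    """
    column: List[float] = []
    for row in t:
        for element in row:
            column.extend([element.real, element.imag])
    return column


def build_training_set(
    transfer_matrices: List[ComplexMatrix], qr_matrices: List[QRMatrix]
) -> List[Tuple[List[float], QRMatrix]]:
    """Pair each target column vector with its QR code matrix."""
    if len(transfer_matrices) != len(qr_matrices):
        raise ValueError("need one QR code matrix per transfer matrix")
    return [(transfer_matrix_to_column(t), qr)
            for t, qr in zip(transfer_matrices, qr_matrices)]


# Hypothetical 2x2 transfer matrix and the QR code matrix it was simulated from.
t = [[complex(0.6, 0.2), complex(0.1, -0.3)],
     [complex(0.1, -0.3), complex(0.5, 0.4)]]
qr = [[1, 0], [0, 1]]
dataset = build_training_set([t], [qr])
```

For a 2×2 transfer matrix this yields a column vector of eight real numbers, which serves as the network input, while the QR code matrix serves as the training target.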
In an optional implementation of this embodiment, a dimension of each of the optical processors is 2×2;
the inverse neural network includes: at least two fully connected layers, at least two convolution layers, and a mixed density layer.
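For reference, a mixed density (mixture density) layer is commonly formulated as below. This is the textbook form, offered as an assumption about what such a layer computes; the source does not state the number of components, the parameterisation, or the loss function. The symbols are introduced here for illustration: t is the target column vector built from a transfer matrix, q is the flattened QR code matrix, K is the number of mixture components, and N is the number of training samples.

```latex
p(\mathbf{q}\mid\mathbf{t}) = \sum_{k=1}^{K} \pi_k(\mathbf{t})\,
  \mathcal{N}\!\left(\mathbf{q};\ \boldsymbol{\mu}_k(\mathbf{t}),\ \sigma_k^2(\mathbf{t})\,\mathbf{I}\right),
\qquad \sum_{k=1}^{K}\pi_k(\mathbf{t}) = 1,\quad \pi_k(\mathbf{t}) \ge 0,\quad \sigma_k(\mathbf{t}) > 0

\mathcal{L} = -\frac{1}{N}\sum_{n=1}^{N} \ln \sum_{k=1}^{K} \pi_k(\mathbf{t}_n)\,
  \mathcal{N}\!\left(\mathbf{q}_n;\ \boldsymbol{\mu}_k(\mathbf{t}_n),\ \sigma_k^2(\mathbf{t}_n)\,\mathbf{I}\right)
```

A mixture output of this kind is often chosen for inverse problems where several different structures can produce the same response, since a single-valued regression would average over them.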
The training device of the reverse neural network model provided by the embodiment of the invention can execute the training method of the reverse neural network model provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 8 is a schematic structural view of a determining device of an optical processor according to a fourth embodiment of the present invention; as shown in fig. 8, the apparatus includes: the target transfer matrix determination module 810 and the nanostructure determination module 820.
The target transfer matrix determining module 810 is configured to determine a target transfer matrix corresponding to an optical processor to be determined, input the target transfer matrix into a pre-trained inverse neural network model, and output a target QR code matrix corresponding to the optical processor to be determined;
a nanostructure determining module 820, configured to determine a nanostructure of the optical processor to be determined according to the target QR code matrix;
According to the scheme of this embodiment, the target transfer matrix determining module determines a target transfer matrix corresponding to the optical processor to be determined, inputs the target transfer matrix into the inverse neural network model obtained through training in advance, and outputs a target QR code matrix corresponding to the optical processor to be determined; the nanostructure determining module determines the nanostructure of the optical processor to be determined according to the target QR code matrix. In this way, the optical processor can be determined rapidly and accurately, and a large amount of calculation time is saved.
The device for determining the optical processor provided by the embodiment of the invention can execute the method for determining the optical processor provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example V
Fig. 9 shows a schematic diagram of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the embodiments of the invention described and/or claimed herein.
As shown in fig. 9, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the training method of the inverse neural network model or the determination method of the optical processor.
In some embodiments, the method of training the inverse neural network model, or the method of determining the optical processor, may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the above-described training method of the inverse neural network model, or the determination method of the optical processor, may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the training method of the inverse neural network model, or the determination method of the optical processor, in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of embodiments of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of embodiments of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the embodiments of the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired result of the technical solution of the embodiments of the present invention can be achieved, which is not limited herein.
The above detailed description should not be construed as limiting the scope of the embodiments of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the embodiments of the present invention should be included in the scope of the embodiments of the present invention.
Claims (10)
1. A method for training an inverse neural network model, comprising:
determining a plurality of optical processors, and acquiring a QR code matrix matched with each optical processor; wherein the optical processor comprises: silicon on insulator, a transmission waveguide, and a nanostructure; the QR code matrix is matched with the nanostructure of the optical processor;
generating a transfer matrix corresponding to each optical processor according to each QR code matrix;
determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into an inverse neural network for iterative training to obtain the inverse neural network model;
wherein the inverse neural network model is used to determine the nanostructure of an optical processor that matches a target transfer matrix.
2. The method of claim 1, wherein the determining a plurality of optical processors, obtaining a QR code matrix that matches each of the optical processors, comprises:
determining a plurality of optical processors, and respectively extracting the nanostructure of each optical processor;
and obtaining a QR code matrix matched with each optical processor according to the nanostructure of each optical processor.
3. The method of claim 2, wherein the generating a transfer matrix corresponding to each of the optical processors from each of the QR code matrices comprises:
obtaining, through simulation with preset software, a transfer matrix corresponding to each optical processor according to each QR code matrix;
the dimensions of the transfer matrices are matched with the dimensions of the corresponding optical processor, and each element of each transfer matrix comprises a real part and an imaginary part.
4. The method of claim 1, wherein said determining a training data set from each of said transfer matrices and each of said QR code matrices comprises:
acquiring each element in each transfer matrix, and sequencing each element according to a set sequence;
generating each target column vector according to each sequencing result;
and respectively combining each target column vector with each QR code matrix to obtain the training data set.
5. The method of claim 1, wherein:
the dimension of each optical processor is 2×2;
the inverse neural network includes: at least two fully connected layers, at least two convolution layers, and a mixed density layer.
6. A method of determining an optical processor, comprising:
determining a target transfer matrix corresponding to an optical processor to be determined, inputting the target transfer matrix into an inverse neural network model obtained by training in advance, and outputting a target QR code matrix corresponding to the optical processor to be determined;
determining the nanostructure of the optical processor to be determined according to the target QR code matrix;
the inverse neural network model is trained by the training method of the inverse neural network model according to any one of claims 1-5.
7. A training device for an inverse neural network model, comprising:
an optical processor determining module, configured to determine a plurality of optical processors and acquire a QR code matrix matched with each optical processor; wherein the optical processor comprises: silicon on insulator, a transmission waveguide, and a nanostructure; the QR code matrix is matched with the nanostructure of the optical processor;
a transfer matrix determining module, configured to generate a transfer matrix corresponding to each optical processor according to each QR code matrix;
an inverse neural network model determining module, configured to determine a training data set according to each transfer matrix and each QR code matrix, and input the training data set into an inverse neural network for iterative training to obtain the inverse neural network model;
wherein the inverse neural network model is used to determine the nanostructure of the optical processor that matches the target transfer matrix.
8. A determining apparatus of an optical processor, comprising:
a target transfer matrix determining module, configured to determine a target transfer matrix corresponding to the optical processor to be determined, input the target transfer matrix into an inverse neural network model obtained through training in advance, and output a target QR code matrix corresponding to the optical processor to be determined;
the nanostructure determining module is used for determining the nanostructure of the optical processor to be determined according to the target QR code matrix;
the inverse neural network model is trained by the training method of the inverse neural network model according to any one of claims 1-5.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method of training the inverse neural network model of any one of claims 1-5 or the method of determining the optical processor of claim 6.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the method of training the inverse neural network model of any one of claims 1-5 or the method of determining the optical processor of claim 6 when executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310283096.6A CN116306849A (en) | 2023-03-20 | 2023-03-20 | Training of reverse neural network model and determining method and device of optical processor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310283096.6A CN116306849A (en) | 2023-03-20 | 2023-03-20 | Training of reverse neural network model and determining method and device of optical processor |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116306849A true CN116306849A (en) | 2023-06-23 |
Family
ID=86832130
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310283096.6A Pending CN116306849A (en) | 2023-03-20 | 2023-03-20 | Training of reverse neural network model and determining method and device of optical processor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116306849A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116911349A (en) * | 2023-09-13 | 2023-10-20 | 华南师范大学 | Optical nano antenna structure prediction network training method, prediction method and device |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116911349A (en) * | 2023-09-13 | 2023-10-20 | 华南师范大学 | Optical nano antenna structure prediction network training method, prediction method and device |
CN116911349B (en) * | 2023-09-13 | 2024-01-09 | 华南师范大学 | Optical nano antenna structure prediction network training method, prediction method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||