CN116306849A - Training of reverse neural network model and determining method and device of optical processor - Google Patents

Training of reverse neural network model and determining method and device of optical processor

Info

Publication number
CN116306849A
Authority
CN
China
Prior art keywords
neural network
optical processor
optical
determining
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310283096.6A
Other languages
Chinese (zh)
Inventor
陈锦华
潘炜炜
徐廷廷
吉晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202310283096.6A priority Critical patent/CN116306849A/en
Publication of CN116306849A publication Critical patent/CN116306849A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Optical Communication System (AREA)

Abstract

The embodiments of the invention disclose a method and a device for training a reverse neural network model and for determining an optical processor. The training method of the reverse neural network model comprises: determining a plurality of optical processors and acquiring a QR code matrix matched with each optical processor, wherein the optical processor comprises silicon on insulator, a transmission waveguide, and a nanostructure, and the QR code matrix is matched with the nanostructure of the optical processor; generating a transfer matrix corresponding to each optical processor according to each QR code matrix; determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model, wherein the reverse neural network model is used to determine the nanostructure of the optical processor that matches a target transfer matrix. With the scheme provided by the embodiments of the invention, a reverse neural network model can be trained, which provides a basis for rapidly determining optical processors with different functions.

Description

Training of reverse neural network model and determining method and device of optical processor
Technical Field
The embodiment of the invention relates to the technical field of neural networks, in particular to a method and a device for training a reverse neural network model and determining an optical processor.
Background
At present, integrated optical neural networks based on silicon-on-insulator (SOI) platforms have received attention in the industry owing to their fast computation speed, low energy consumption, and good parallel computing capability. The optical processor mainly uses a silicon photonic chip to perform fast, efficient matrix multiplication with low energy consumption.
The conventional 2×2 optical processor is a Mach-Zehnder interferometer (MZI) consisting of a directional coupler, a multimode interferometer, and a phase shifter. An MZI is usually larger than 100 μm, which is a disadvantage for high-density integration. To achieve a high degree of compactness, the industry has proposed using topology optimization to implement free-form devices with a large number of degrees of freedom (DOF). However, compact free-form devices are difficult to manufacture; therefore, Quick Response (QR) code-like devices, which are highly compact and have a controllable minimum feature size, have received great attention.
Fig. 1 is a schematic diagram of a 2×2 optical processor based on an SOI platform according to an embodiment of the present invention. As shown in Fig. 1, the geometry divides the intermediate transmission waveguide into small blocks, and each block can be in one of two states: etched or non-etched. The intermediate nanostructure pattern thus resembles a QR code, and different nanostructure patterns change the transfer matrix of the processor. The design of such devices typically relies on iterative optimization algorithms such as the genetic algorithm (GA), particle swarm optimization (PSO), and direct binary search (DBS). However, designing a single device with a target optical response typically requires hundreds of electromagnetic simulations, so this iterative approach becomes very laborious when multiple 2×2 optical processors with different functions are required.
With the rapid development of neural networks, those skilled in the art have begun to consider whether a reverse neural network model can be trained to quickly determine optical processors with different functions.
Disclosure of Invention
The embodiment of the invention provides a method and a device for training a reverse neural network model and determining an optical processor, which are used for training the reverse neural network model and providing basis for rapidly determining the optical processors with different functions.
According to an aspect of the embodiment of the present invention, there is provided a training method of a reverse neural network model, including:
determining a plurality of optical processors, and acquiring a QR code matrix matched with each optical processor; wherein the optical processor comprises: silicon on insulator, transmission waveguide, and nanostructure; the QR code matrix is matched with the nanostructure of the optical processor;
generating a transmission matrix corresponding to each optical processor according to each QR code matrix;
determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model;
wherein the inverse neural network model is used to determine the nanostructure of the optical processor that matches the target transfer matrix.
According to another aspect of the embodiment of the present invention, there is provided a method for determining an optical processor, including:
determining a target transfer matrix corresponding to an optical processor to be determined, inputting the target transfer matrix into a reverse neural network model obtained by training in advance, and outputting a target QR code matrix corresponding to the optical processor to be determined;
determining the nanostructure of the optical processor to be determined according to the target QR code matrix;
the reverse neural network model is obtained by training the training method of the reverse neural network model according to any one of the embodiments of the invention.
According to another aspect of the embodiment of the present invention, there is provided a training apparatus for a reverse neural network model, including:
the optical processor determining module is used for determining a plurality of optical processors and acquiring a QR code matrix matched with each optical processor; wherein the optical processor comprises: silicon on insulator, transmission waveguide, and nanostructure; the QR code matrix is matched with the nanostructure of the optical processor;
the transmission matrix determining module is used for generating transmission matrixes corresponding to the optical processors according to the QR code matrixes;
the reverse neural network model determining module is used for determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model;
wherein the inverse neural network model is used to determine the nanostructure of the optical processor that matches the target transfer matrix.
According to another aspect of the embodiment of the present invention, there is provided a determining apparatus of an optical processor, including:
the target transfer matrix determining module is used for determining a target transfer matrix corresponding to the optical processor to be determined, inputting the target transfer matrix into a reverse neural network model obtained through training in advance, and outputting a target QR code matrix corresponding to the optical processor to be determined;
the nanostructure determining module is used for determining the nanostructure of the optical processor to be determined according to the target QR code matrix;
the reverse neural network model is obtained by training the training method of the reverse neural network model according to any one of the embodiments of the invention.
According to another aspect of an embodiment of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method for training the inverse neural network model or the method for determining the optical processor according to any of the embodiments of the present invention.
According to another aspect of the embodiments of the present invention, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement the training method of the inverse neural network model or the determining method of the optical processor according to any of the embodiments of the present invention.
According to the technical scheme, the plurality of optical processors are determined, and the QR code matrix matched with each optical processor is obtained; wherein the optical processor comprises: silicon on insulator, transmission waveguide, and nanostructure; the QR code matrix is matched with the nanostructure of the optical processor; generating a transmission matrix corresponding to each optical processor according to each QR code matrix; determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model; the reverse neural network model is used for determining the nanostructure of the optical processor matched with the target transfer matrix, and can train a reverse neural network model so as to provide basis for rapidly determining the optical processors with different functions.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention, nor is it intended to be used to limit the scope of the embodiments of the invention. Other features of embodiments of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a 2×2 optical processor based on an SOI platform according to an embodiment of the present invention;
Fig. 2 is a flowchart of a training method of a reverse neural network model according to a first embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a reverse neural network according to a first embodiment of the present invention;
Fig. 4 is a schematic diagram of a Gaussian mixture distribution model according to a first embodiment of the invention;
Fig. 5 is a schematic view of a mixed density layer structure according to a first embodiment of the present invention;
Fig. 6 is a flow chart of a method for determining an optical processor according to a second embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a training device for a reverse neural network model according to a third embodiment of the present invention;
Fig. 8 is a schematic structural view of a determining device of an optical processor according to a fourth embodiment of the present invention;
Fig. 9 is a schematic structural diagram of an electronic device implementing a training method of an inverse neural network model, or a determination method of an optical processor according to an embodiment of the present invention.
Detailed Description
In order to make the embodiments of the present invention better understood by those skilled in the art, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the embodiments of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the embodiments of the present invention and the above-described drawings are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 2 is a flowchart of a training method of a reverse neural network model according to an embodiment of the present invention, where the embodiment is applicable to training to obtain a reverse neural network model, the method may be performed by a training device of the reverse neural network model, the training device of the reverse neural network model may be implemented in a hardware and/or software form, and the training device of the reverse neural network model may be configured in an electronic device such as a computer, a server or a tablet computer. Specifically, referring to fig. 2, the method specifically includes the following steps:
Step 210, determining a plurality of optical processors, and acquiring a QR code matrix matched with each optical processor.
Wherein the optical processor comprises: silicon on insulator, a transmission waveguide, and a nanostructure; the QR code matrix is matched with the nanostructure of the optical processor. It can be understood that, in the geometry of the optical processor, the middle transmission waveguide is divided into small blocks, and each block can be in one of two states, etched or non-etched, so that the middle nanostructure pattern resembles a QR code. In this embodiment the QR code may be represented by a matrix, that is, the nanostructure of the optical processor is represented by a QR code matrix.
The dimension of the optical processor may be 2×2 or 3×3, which is not limited in this embodiment.
In an optional implementation manner of this embodiment, determining a plurality of optical processors and acquiring a QR code matrix matched with each of the optical processors may include: determining a plurality of optical processors, and respectively extracting the nanostructure of each optical processor; and obtaining a QR code matrix matched with each optical processor according to the nanostructure of each optical processor.
Optionally, in this embodiment, the number of optical processors may be arbitrary, for example 5, 10, or 50, and the QR codes and transfer matrices of the optical processors differ from one another. While the plurality of optical processors is being determined, the nanostructure of each optical processor may be extracted, and the QR code matrix matched with each optical processor may be obtained from that nanostructure.
Alternatively, in this embodiment, a plurality of QR code nanopatterns may be randomly generated, each nanostructure being represented by a matrix X. In this embodiment, X can be assumed to be an 18×18 matrix, i.e.
$$X = (x_{jk}) \in \{0, 1\}^{18 \times 18},$$
where each entry $x_{jk}$ records which of the two states (etched or non-etched) the corresponding block is in.
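The random generation of QR code nanopatterns described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the embodiment: the function name `random_qr_matrices`, the fixed seed, and the uniform 50/50 choice between the two states are all assumptions, as is the convention for which value denotes an etched block.

```python
import numpy as np

def random_qr_matrices(count, size=18, seed=0):
    """Return `count` random binary matrices of shape (size, size).

    Assumption: every block is drawn independently and uniformly from {0, 1}.
    Which value means "etched" is a convention the source does not specify.
    """
    rng = np.random.default_rng(seed)
    # `high` is exclusive, so the entries are 0 or 1.
    return rng.integers(0, 2, size=(count, size, size), dtype=np.uint8)

patterns = random_qr_matrices(count=5)
print(patterns.shape)  # (5, 18, 18)
```

Each slice `patterns[k]` plays the role of one matrix X.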
Step 220, generating a transfer matrix corresponding to each optical processor according to each QR code matrix.
In an optional implementation manner of this embodiment, after the plurality of optical processors is determined, the transfer matrix corresponding to each optical processor may be obtained by simulation from the QR code matrix of that optical processor. The transfer matrix T is a 2×2 matrix in which each element has a real part and an imaginary part, so T contains 8 parameters in total.
It should be noted that the dimension of the transfer matrix of an optical processor matches the dimension of that optical processor. For example, if the dimension of the optical processor is 2×2, the dimension of the corresponding transfer matrix is also 2×2; if the dimension of the optical processor is 3×3, the dimension of the corresponding transfer matrix is 3×3. That is, in this embodiment, the dimension of each transfer matrix matches the dimension of the corresponding optical processor, and each element of each transfer matrix includes a real part and an imaginary part.
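As a worked count that follows from the statements above (the symbol $k$ for the processor dimension and $N_{\text{param}}$ for the parameter count are introduced here only for illustration): a $k \times k$ transfer matrix has $k^2$ complex elements, each contributing a real part and an imaginary part, so

$$N_{\text{param}} = 2k^{2}, \qquad k = 2 \Rightarrow N_{\text{param}} = 8, \qquad k = 3 \Rightarrow N_{\text{param}} = 18.$$

The value 8 for the 2×2 case is the one stated in the text; the value 18 for the 3×3 case is obtained by the same arithmetic.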
In an optional implementation manner of this embodiment, generating, according to each QR code matrix, a transfer matrix corresponding to each optical processor includes: obtaining, by simulation with preset software, the transfer matrix corresponding to each optical processor according to each QR code matrix.
The preset software may be Lumerical or other simulation software, which is not limited in this embodiment.
In this embodiment, the transfer matrix corresponding to each optical processor may be obtained by automated simulation using Lumerical software.
Step 230, determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model.
Wherein the inverse neural network model is used to determine the nanostructure of the optical processor that matches the target transfer matrix.
In an optional implementation manner of this embodiment, after determining to obtain the transmission matrix and each QR code matrix corresponding to each optical processor, the training data set of the inverse neural network model may be further determined according to each transmission matrix and each QR code matrix.
Optionally, in this embodiment, determining the training data set according to each of the transfer matrices and each of the QR code matrices may include: acquiring each element in each transfer matrix, and sequencing each element according to a set sequence; generating each target column vector according to each sequencing result; and respectively combining each target column vector with each QR code matrix to obtain the training data set.
In this embodiment, the transfer matrix T is a 2×2 matrix in which each element has a real part and an imaginary part, giving 8 parameters in total. The 8 parameters are taken sequentially in the set order (from small to large) to convert the transfer matrix into a 1×8 vector Y, and (Y, X) is then taken as one training sample, where X is the QR code matrix. A plurality of such (Y, X) pairs forms the data set referred to in this embodiment.
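The conversion of a transfer matrix into a vector Y and the pairing with X might look like the sketch below. The text only requires that the elements follow a set order; the specific order used here (row-major element index, real part before imaginary part) is an assumption, and the names `transfer_matrix_to_vector` and `build_dataset` are hypothetical.

```python
import numpy as np

def transfer_matrix_to_vector(T):
    """Flatten a complex k x k transfer matrix into a real vector of length 2*k*k."""
    T = np.asarray(T, dtype=np.complex128)
    flat = T.reshape(-1)  # row-major: T11, T12, T21, T22 for k = 2
    # Assumed ordering: (Re, Im) pairs, one pair per element, in element-index order.
    return np.stack([flat.real, flat.imag], axis=1).reshape(-1)

def build_dataset(transfer_matrices, qr_matrices):
    """Pair each vector Y with its QR code matrix X, giving a list of (Y, X)."""
    return [(transfer_matrix_to_vector(T), np.asarray(X))
            for T, X in zip(transfer_matrices, qr_matrices)]

# Illustrative values only; a real T would come from electromagnetic simulation.
T = np.array([[0.6 + 0.1j, 0.2 - 0.3j],
              [0.1 + 0.4j, 0.5 - 0.2j]])
Y = transfer_matrix_to_vector(T)
print(Y.shape)  # (8,)
```

Whatever order is chosen, it has to be the same for every sample and for every target transfer matrix later fed to the trained model.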
The reverse neural network model in this embodiment carries out the inverse design task for the 2×2 optical processor: the target optical response is input into the network, which outputs the corresponding nanostructure of the optical processor, that is, the QR code. 90% of the obtained data set is used as the training set and 10% as the validation set. A predicted X' is obtained by inputting Y, the optical response corresponding to X' is simulated with Lumerical software, and this response is compared with the actual optical response Y in order to train the reverse neural network.
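A 90%/10% split of the (Y, X) pairs can be sketched as follows. The text gives only the proportions; shuffling before the split, the fixed seed, and the function name `split_dataset` are assumptions made for this illustration.

```python
import numpy as np

def split_dataset(Y, X, train_fraction=0.9, seed=0):
    """Shuffle the samples and split them into a training and a validation part."""
    Y = np.asarray(Y)
    X = np.asarray(X)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(Y))          # assumed: random shuffle before splitting
    n_train = int(round(train_fraction * len(Y)))
    train_idx, val_idx = order[:n_train], order[n_train:]
    return (Y[train_idx], X[train_idx]), (Y[val_idx], X[val_idx])

# Placeholder arrays with the shapes used in the text: Y is 1x8, X is 18x18.
Y = np.zeros((100, 8))
X = np.zeros((100, 18, 18))
(train_Y, train_X), (val_Y, val_X) = split_dataset(Y, X)
print(train_Y.shape, val_Y.shape)  # (90, 8) (10, 8)
```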
The reverse neural network may include: at least two fully connected layers, at least two convolution layers, and a mixed density layer. Fig. 3 is a schematic structural diagram of a reverse neural network according to a first embodiment of the present invention; as shown in Fig. 3, the reverse neural network includes three fully connected layers, three convolution layers, and one mixed density layer.
It should be noted that different QR codes may generate the same optical response, that is, one Y may correspond to several different X. This one-to-many mapping can make the network hard to converge or degrade the training result. This embodiment therefore adds a mixed density layer to the network structure. The output of the mixed density layer is a Gaussian mixture distribution, which can handle the multi-valued mapping. Fig. 4 is a schematic diagram of a Gaussian mixture distribution model according to an embodiment of the present invention. A deep convolutional network is used to obtain a predicted QR code X'. Assuming X' is an 18×18 matrix, it can be flattened into a 1×324 matrix, and $X' = [x_1, x_2, \ldots, x_{324}]^{T}$ is input into the mixed density layer. At this point X' can be regarded as a 324-dimensional vector. In essence, the mixed density layer has to predict a Gaussian mixture distribution for X'.
In this embodiment, the input of the mixed density layer is denoted by x and the output by t. The input is an n-dimensional vector (if the QR code is an 18×18 matrix, then n = 324), and the probability density of the target value can be expressed as a linear combination of multiple kernel functions:
$$p(t \mid x) = \sum_{i=1}^{m} \alpha_i(x)\, \phi_i(t \mid x)$$
where $\alpha_i(x)$ is called the mixing coefficient and can be regarded as a prior probability (conditioned on x), and $\phi_i$ denotes the i-th kernel of the target vector t. Here the kernel function is a Gaussian distribution function, m is the number of kernels in the Gaussian mixture distribution, and the kernel function $\phi_i$ is expressed as:
$$\phi_i(t \mid x) = \frac{1}{(2\pi)^{c/2}\, \sigma_i(x)^{c}} \exp\!\left( -\frac{\lVert t - \mu_i(x) \rVert^{2}}{2\, \sigma_i(x)^{2}} \right)$$
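For reference, a Gaussian kernel with a scalar width $\sigma_i$ and a vector mean $\mu_i$ of dimension c, as described in the text, can be evaluated as in the sketch below. It assumes the standard isotropic form $\phi_i = (2\pi)^{-c/2}\sigma_i^{-c}\exp(-\lVert t-\mu_i\rVert^2 / 2\sigma_i^2)$; working with the logarithm is a choice made here because the density itself underflows for c = 324. The function name is hypothetical.

```python
import numpy as np

def log_gaussian_kernel(t, mu, sigma):
    """Return log phi_i(t|x) for m isotropic Gaussian kernels.

    t:     target vector, shape (c,)
    mu:    kernel means, shape (m, c)
    sigma: kernel widths (scalars), shape (m,)
    """
    t = np.asarray(t, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    c = t.shape[0]
    sq_dist = np.sum((t - mu) ** 2, axis=1)   # ||t - mu_i||^2 for each kernel
    return (-0.5 * c * np.log(2.0 * np.pi)
            - c * np.log(sigma)
            - sq_dist / (2.0 * sigma ** 2))

# Two kernels in c = 324 dimensions; the values are placeholders.
log_phi = log_gaussian_kernel(np.zeros(324), np.zeros((2, 324)), np.array([1.0, 2.0]))
print(log_phi.shape)  # (2,)
```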
where c is the dimension of t. Since the input x is n-dimensional, the output t is also n-dimensional, that is, c = n. Each kernel function is a multivariate Gaussian distribution, $\sigma_i(x)$ is a scalar, and $\mu_i(x)$ is a vector of the same dimension as the target value t. As Fig. 3 shows, the mixed density layer directly follows the deep convolutional neural network and outputs a Gaussian mixture model. Fig. 5 is a schematic diagram of a mixed density layer structure according to a first embodiment of the present invention, in which the front part is still a neural network structure and the back part is a Gaussian mixture model. According to the formula for p(t|x), the parameters to be optimized are $\alpha_i(x)$, $\sigma_i(x)$, and $\mu_i(x)$. One p(t|x) has m values $\alpha_i(x)$, m values $\sigma_i(x)$, and m vectors $\mu_i(x)$ that together contain nm scalars, so the neural network should have m + m + nm = m(n+2) output variables. In a Gaussian mixture distribution, the mixing coefficients sum to 1:
$$\sum_{i=1}^{m} \alpha_i(x) = 1$$
In a neural network this can be achieved with a softmax function:
$$\alpha_i = \frac{\exp(z_i^{\alpha})}{\sum_{j=1}^{m} \exp(z_j^{\alpha})}$$
where $z_i^{\alpha}$ corresponds to one output variable of the neural network. The variance parameter and the mean of each Gaussian unit can be expressed correspondingly as $\sigma_i = \exp(z_i^{\sigma})$ and $\mu_{ik} = z_{ik}^{\mu}$,
where $\mu_{ik}$ is the k-th scalar of the mean $\mu_i$ of the i-th Gaussian distribution, since $\mu_i$ is an n-dimensional vector.
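The mapping from the m(n+2) raw network outputs to the mixture parameters can be sketched as below: a softmax for the mixing coefficients, an exponential for the widths, and the identity for the means. The layout of the output vector (first the m values for α, then the m values for σ, then the nm values for μ) and the function name `mdn_parameters` are assumptions; the source does not state how the outputs are arranged.

```python
import numpy as np

def mdn_parameters(z, m, n):
    """Map raw outputs z of length m*(n+2) to (alpha, sigma, mu)."""
    z = np.asarray(z, dtype=float)
    if z.shape != (m * (n + 2),):
        raise ValueError(f"expected {m * (n + 2)} outputs, got shape {z.shape}")
    # Assumed layout: [z_alpha (m) | z_sigma (m) | z_mu (m*n)].
    z_alpha, z_sigma, z_mu = z[:m], z[m:2 * m], z[2 * m:]
    e = np.exp(z_alpha - z_alpha.max())   # subtract the max for numerical stability
    alpha = e / e.sum()                   # softmax: coefficients sum to 1
    sigma = np.exp(z_sigma)               # exponential keeps the widths positive
    mu = z_mu.reshape(m, n)               # one n-dimensional mean per kernel
    return alpha, sigma, mu

# m = 3 kernels and n = 324 give 3 * 326 = 978 raw outputs.
alpha, sigma, mu = mdn_parameters(np.zeros(3 * (324 + 2)), m=3, n=324)
print(alpha.shape, sigma.shape, mu.shape)  # (3,) (3,) (3, 324)
```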
The loss function of the reverse neural network seeks the parameters that maximize the probability p(t|x) for a given x, that is, argmax p(t|x), which is usually written as an error function to be minimized:
$$E = \sum_{q} E^{q}$$
where
$$E^{q} = -\ln\left\{ \sum_{i=1}^{m} \alpha_i(x^{q})\, \phi_i(t^{q} \mid x^{q}) \right\}$$
represents the loss of each sample q. Finally, the output Gaussian mixture distribution is obtained by optimizing the parameter values so that the error function is minimized.
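The error function above is the negative log-likelihood of the targets under the predicted mixture, summed over samples. A minimal NumPy sketch follows, assuming the isotropic Gaussian kernel with scalar $\sigma_i$ and using a log-sum-exp formulation for numerical stability; that formulation and the function name are choices made for this illustration and are not stated in the source.

```python
import numpy as np

def mdn_negative_log_likelihood(alpha, sigma, mu, targets):
    """Return E = sum_q E^q with E^q = -ln( sum_i alpha_i * phi_i(t^q) ).

    alpha:   mixing coefficients, shape (Q, m)
    sigma:   kernel widths, shape (Q, m)
    mu:      kernel means, shape (Q, m, c)
    targets: target vectors t^q, shape (Q, c)
    """
    alpha = np.asarray(alpha, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    mu = np.asarray(mu, dtype=float)
    targets = np.asarray(targets, dtype=float)
    c = targets.shape[1]
    sq_dist = np.sum((targets[:, None, :] - mu) ** 2, axis=2)        # (Q, m)
    log_phi = (-0.5 * c * np.log(2.0 * np.pi) - c * np.log(sigma)
               - sq_dist / (2.0 * sigma ** 2))                       # (Q, m)
    a = np.log(alpha) + log_phi
    a_max = a.max(axis=1, keepdims=True)
    # log-sum-exp over the kernels gives ln p(t^q | x^q) without underflow.
    log_p = a_max[:, 0] + np.log(np.sum(np.exp(a - a_max), axis=1))
    return float(-np.sum(log_p))
```

In training, `alpha`, `sigma`, and `mu` would be produced by the network for each input, and a gradient-based optimizer would minimize the returned value; the optimizer is not specified in the source.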
According to the technical scheme, a plurality of optical processors are determined, and a QR code matrix matched with each optical processor is obtained; wherein the optical processor comprises: silicon on insulator, transmission waveguide, and nanostructure; the QR code matrix is matched with the nanostructure of the optical processor; generating a transmission matrix corresponding to each optical processor according to each QR code matrix; determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model; the reverse neural network model is used for determining the nanostructure of the optical processor matched with the target transfer matrix, and can train a reverse neural network model so as to provide basis for rapidly determining the optical processors with different functions.
Example 2
FIG. 6 is a flow chart of a method for determining an optical processor according to a second embodiment of the present invention; the present embodiment may be applied to a case where the above embodiment is trained to obtain a reverse neural network model to determine an optical processor, where the method may be performed by a determining device of the optical processor, where the determining device of the optical processor may be implemented in hardware and/or software, and the determining device of the optical processor may be configured in an electronic device such as a computer, a server, or a tablet computer. Specifically, referring to fig. 6, the method specifically includes the steps of:
Step 610, determining a target transmission matrix corresponding to the optical processor to be determined, inputting the target transmission matrix into a pre-trained inverse neural network model, and outputting a target QR code matrix corresponding to the optical processor to be determined.
In an optional implementation manner of this embodiment, a target transmission matrix corresponding to the optical processor to be determined may be determined according to a design requirement or a functional requirement, and the target transmission matrix is input into the reverse neural network model obtained through training in the foregoing embodiment, and the target transmission matrix is processed by each network layer in the reverse neural network model, so as to output a target QR code matrix corresponding to the optical processor to be determined.
Step 620, determining the nanostructure of the optical processor to be determined according to the target QR code matrix.
In an optional implementation manner of this embodiment, after the QR code matrix corresponding to the optical processor to be determined is obtained through the inverse neural network model, the QR code matrix may be converted, so as to obtain the nanostructure of the optical processor to be determined.
In an optional implementation manner of this embodiment, when the target transfer matrix is input into the reverse neural network model, it may first be processed by the fully connected layers to obtain a processing result; the processing result is then processed sequentially by the convolution layers to obtain a convolution layer processing result, which is input into the mixed density layer to obtain the QR code matrix corresponding to the target transfer matrix.
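The source does not state how the Gaussian mixture produced by the mixed density layer is turned into a binary QR code matrix. One plausible reading, shown here purely as an assumption, is to take the mean of the kernel with the largest mixing coefficient, threshold it, and reshape it to 18×18. The function name `mixture_to_qr_matrix` and the 0.5 threshold are hypothetical.

```python
import numpy as np

def mixture_to_qr_matrix(alpha, mu, size=18, threshold=0.5):
    """Turn mixture parameters into a binary (size x size) QR code matrix.

    alpha: mixing coefficients, shape (m,)
    mu:    kernel means, shape (m, size*size)
    Assumption: use the most probable kernel and threshold its mean.
    """
    alpha = np.asarray(alpha, dtype=float)
    mu = np.asarray(mu, dtype=float)
    best = int(np.argmax(alpha))                      # kernel with the largest weight
    binary = (mu[best] >= threshold).astype(np.uint8)
    return binary.reshape(size, size)

# Placeholder mixture with two kernels in 324 dimensions.
qr = mixture_to_qr_matrix([0.2, 0.8], np.vstack([np.zeros(324), np.ones(324)]))
print(qr.shape)  # (18, 18)
```

Because one target transfer matrix may match several nanostructures, the other kernels of the mixture could supply alternative candidates in the same way.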
According to the scheme of this embodiment, the target transmission matrix corresponding to the optical processor to be determined is determined, the target transmission matrix is input into a reverse neural network model obtained through training in advance, and the target QR code matrix corresponding to the optical processor to be determined is output; the nanostructure of the optical processor to be determined is then determined according to the target QR code matrix. In this way the optical processor can be determined quickly and accurately, saving a large amount of calculation time.
Example 3
Fig. 7 is a schematic structural diagram of a training device for a reverse neural network model according to a third embodiment of the present invention. As shown in fig. 7, the apparatus includes: an optical processor determination module 710, a transfer matrix determination module 720, and an inverse neural network model determination module 730.
An optical processor determining module 710, configured to determine a plurality of optical processors, and obtain a QR code matrix matched with each of the optical processors; wherein the optical processor comprises: silicon on insulator, transmission waveguide, and nanostructure; the QR code matrix is matched with the nanostructure of the optical processor;
a transfer matrix determining module 720, configured to generate a transfer matrix corresponding to each of the optical processors according to each of the QR code matrices;
the inverse neural network model determining module 730 is configured to determine a training data set according to each transfer matrix and each QR code matrix, and input the training data set into an inverse neural network for iterative training, so as to obtain an inverse neural network model.
According to the scheme of this embodiment, the optical processor determining module determines a plurality of optical processors and obtains a QR code matrix matched with each optical processor, wherein the optical processor comprises silicon on insulator, a transmission waveguide, and a nanostructure, and the QR code matrix is matched with the nanostructure of the optical processor; the transfer matrix determining module generates a transfer matrix corresponding to each optical processor according to each QR code matrix; and the inverse neural network model determining module determines a training data set according to each transfer matrix and each QR code matrix and inputs the training data set into an inverse neural network for iterative training to obtain an inverse neural network model, which provides a basis for rapidly determining optical processors with different functions.
In an optional implementation manner of this embodiment, the optical processor determining module 710 is specifically configured to determine a plurality of the optical processors, and extract the nanostructures of each optical processor separately; and obtaining a QR code matrix matched with each optical processor according to the nanostructure of each optical processor.
In an optional implementation manner of this embodiment, the transfer matrix determining module 720 is specifically configured to obtain, by simulation using preset software, the transfer matrix corresponding to each of the optical processors according to each QR code matrix;
the dimension of each transfer matrix matches the dimension of the corresponding optical processor, and each element of each transfer matrix comprises a real part and an imaginary part.
In an optional implementation manner of this embodiment, the inverse neural network model determining module 730 is specifically configured to obtain each element in each transfer matrix, and sort the elements according to a set order;
generating each target column vector according to each sequencing result;
and respectively combining each target column vector with each QR code matrix to obtain the training data set.
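The construction of the training data set can be illustrated with the short sketch below. The set order is not spelled out in this passage, so the sketch assumes a row-major traversal of the transfer matrix with all real parts placed before all imaginary parts; the helper names are hypothetical.

```python
import numpy as np

def transfer_matrix_to_column(T):
    """Flatten a complex transfer matrix row by row and stack the real
    parts above the imaginary parts as one real column vector.

    Assumption: the 'set order' is row-major, real parts first.
    """
    T = np.asarray(T, dtype=complex)
    flat = T.ravel()
    return np.concatenate([flat.real, flat.imag]).reshape(-1, 1)

def build_training_set(transfer_matrices, qr_matrices):
    """Pair each target column vector with its QR code matrix."""
    return [(transfer_matrix_to_column(T), np.asarray(q))
            for T, q in zip(transfer_matrices, qr_matrices)]

T = np.array([[1 + 2j, 3 + 4j],
              [5 + 6j, 7 + 8j]])
q = np.array([[1, 0],
              [0, 1]])
dataset = build_training_set([T], [q])
x, y = dataset[0]   # x: 8x1 column vector, y: the matching QR code matrix
```

For a 2×2 transfer matrix this yields an 8×1 input vector, since each of the four complex elements contributes a real part and an imaginary part.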
In an optional implementation of this embodiment, a dimension of each of the optical processors is 2×2;
the inverse neural network includes: at least two fully connected layers, at least two convolution layers, and a mixed density layer.
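The training objective for the mixed density layer is not stated in this passage. One common choice for mixture density outputs, shown here purely as an assumption, is the negative log-likelihood of the target under an isotropic Gaussian mixture; the function name `mdn_nll` and its argument layout are illustrative.

```python
import numpy as np

def mdn_nll(pi, mu, sigma, target):
    """Negative log-likelihood of `target` under an isotropic Gaussian mixture.

    pi:     (K,)   mixture weights summing to 1
    mu:     (K, D) component means
    sigma:  (K,)   component standard deviations
    target: (D,)   flattened QR code matrix
    """
    d = target.size
    sq = np.sum((target - mu) ** 2, axis=1)            # squared distance per component
    log_comp = (np.log(pi)
                - 0.5 * d * np.log(2 * np.pi * sigma ** 2)
                - sq / (2 * sigma ** 2))
    m = log_comp.max()                                 # log-sum-exp for stability
    return -(m + np.log(np.sum(np.exp(log_comp - m))))

target = np.array([1.0, 0.0, 0.0, 1.0])
# A single component centred exactly on the target.
loss_exact = mdn_nll(np.array([1.0]), target[None, :], np.array([1.0]), target)
# The same component shifted away from the target gives a larger loss.
loss_shifted = mdn_nll(np.array([1.0]), target[None, :] + 1.0, np.array([1.0]), target)
```

Because several different nanostructures can produce nearly the same transfer matrix, a mixture output lets the network represent more than one candidate QR code matrix for a given target, which a single-valued regression output cannot do.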
The training device of the reverse neural network model provided by the embodiment of the invention can execute the training method of the reverse neural network model provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 8 is a schematic structural view of a determining device of an optical processor according to a fourth embodiment of the present invention; as shown in fig. 8, the apparatus includes: the target transfer matrix determination module 810 and the nanostructure determination module 820.
The target transfer matrix determining module 810 is configured to determine a target transfer matrix corresponding to an optical processor to be determined, input the target transfer matrix into a pre-trained inverse neural network model, and output a target QR code matrix corresponding to the optical processor to be determined;
a nanostructure determining module 820, configured to determine a nanostructure of the optical processor to be determined according to the target QR code matrix.
According to the scheme of this embodiment, the target transfer matrix determining module determines a target transfer matrix corresponding to an optical processor to be determined, inputs the target transfer matrix into a reverse neural network model obtained through training in advance, and outputs a target QR code matrix corresponding to the optical processor to be determined; the nanostructure determining module then determines the nanostructure of the optical processor to be determined according to the target QR code matrix, so that the optical processor can be determined rapidly and accurately, and a large amount of calculation time is saved.
The device for determining the optical processor provided by the embodiment of the invention can execute the method for determining the optical processor provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example V
Fig. 9 shows a schematic diagram of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the embodiments of the invention described and/or claimed herein.
As shown in fig. 9, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the training method of the inverse neural network model, or the determination method of the optical processor.
In some embodiments, the method of training the inverse neural network model, or the method of determining the optical processor, may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the above-described training method of the inverse neural network model, or the determination method of the optical processor, may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the training method of the inverse neural network model, or the determination method of the optical processor, in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of embodiments of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of embodiments of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the embodiments of the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired result of the technical solution of the embodiments of the present invention can be achieved, which is not limited herein.
The above detailed description should not be construed as limiting the scope of the embodiments of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the embodiments of the present invention should be included in the scope of the embodiments of the present invention.

Claims (10)

1. A method for training a reverse neural network model, comprising:
determining a plurality of optical processors, and acquiring a QR code matrix matched with each optical processor; wherein the optical processor comprises: silicon on insulator, transmission waveguide, and nanostructure; the QR code matrix is matched with the nanostructure of the optical processor;
generating a transfer matrix corresponding to each optical processor according to each QR code matrix;
determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model;
wherein the inverse neural network model is used to determine the nanostructure of the optical processor that matches the target transfer matrix.
2. The method of claim 1, wherein the determining a plurality of optical processors, obtaining a QR code matrix that matches each of the optical processors, comprises:
determining a plurality of optical processors, and respectively extracting the nanostructure of each optical processor;
and obtaining a QR code matrix matched with each optical processor according to the nanostructure of each optical processor.
3. The method of claim 2, wherein the generating a transfer matrix corresponding to each of the optical processors from each of the QR code matrices comprises:
obtaining, by simulation using preset software, the transfer matrix corresponding to each optical processor according to each QR code matrix;
the dimensions of the transfer matrices are matched with the dimensions of the corresponding optical processor, and each element of each transfer matrix comprises a real part and an imaginary part.
4. The method of claim 1, wherein said determining a training data set from each of said transfer matrices and each of said QR code matrices comprises:
acquiring each element in each transfer matrix, and sequencing each element according to a set sequence;
generating each target column vector according to each sequencing result;
and respectively combining each target column vector with each QR code matrix to obtain the training data set.
5. The method of claim 1, wherein:
the dimension of each optical processor is 2×2;
the inverse neural network includes: at least two fully connected layers, at least two convolution layers, and a mixed density layer.
6. A method of determining an optical processor, comprising:
determining a target transfer matrix corresponding to an optical processor to be determined, inputting the target transfer matrix into a reverse neural network model obtained by training in advance, and outputting a target QR code matrix corresponding to the optical processor to be determined;
determining the nanostructure of the optical processor to be determined according to the target QR code matrix;
the inverse neural network model is trained by the training method of the inverse neural network model according to any one of claims 1-5.
7. A training device for a reverse neural network model, comprising:
the optical processor determining module is used for determining a plurality of optical processors and acquiring a QR code matrix matched with each optical processor; wherein the optical processor comprises: silicon on insulator, transmission waveguide, and nanostructure; the QR code matrix is matched with the nanostructure of the optical processor;
the transfer matrix determining module is used for generating a transfer matrix corresponding to each optical processor according to each QR code matrix;
the reverse neural network model determining module is used for determining a training data set according to each transfer matrix and each QR code matrix, and inputting the training data set into a reverse neural network for iterative training to obtain a reverse neural network model;
wherein the inverse neural network model is used to determine the nanostructure of the optical processor that matches the target transfer matrix.
8. A determining apparatus of an optical processor, comprising:
the target transfer matrix determining module is used for determining a target transfer matrix corresponding to the optical processor to be determined, inputting the target transfer matrix into a reverse neural network model obtained through training in advance, and outputting a target QR code matrix corresponding to the optical processor to be determined;
the nanostructure determining module is used for determining the nanostructure of the optical processor to be determined according to the target QR code matrix;
the inverse neural network model is trained by the training method of the inverse neural network model according to any one of claims 1-5.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method of training the inverse neural network model of any one of claims 1-5 or the method of determining the optical processor of claim 6.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the method of training the inverse neural network model of any one of claims 1-5 or the method of determining the optical processor of claim 6 when executed.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310283096.6A CN116306849A (en) 2023-03-20 2023-03-20 Training of reverse neural network model and determining method and device of optical processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310283096.6A CN116306849A (en) 2023-03-20 2023-03-20 Training of reverse neural network model and determining method and device of optical processor

Publications (1)

Publication Number Publication Date
CN116306849A true CN116306849A (en) 2023-06-23

Family

ID=86832130

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310283096.6A Pending CN116306849A (en) 2023-03-20 2023-03-20 Training of reverse neural network model and determining method and device of optical processor

Country Status (1)

Country Link
CN (1) CN116306849A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116911349A (en) * 2023-09-13 2023-10-20 华南师范大学 Optical nano antenna structure prediction network training method, prediction method and device
CN116911349B (en) * 2023-09-13 2024-01-09 华南师范大学 Optical nano antenna structure prediction network training method, prediction method and device

Similar Documents

Publication Publication Date Title
CN112561068B (en) Simulation method, computing device, classical device, storage device and product
CN112966522B (en) Image classification method and device, electronic equipment and storage medium
JP7291183B2 (en) Methods, apparatus, devices, media, and program products for training models
JP7354320B2 (en) Quantum device noise removal method and apparatus, electronic equipment, computer readable storage medium, and computer program
KR20220005416A (en) Method for training multivariate relationship generation model, electronic device and medium
CN112541590B (en) Quantum entanglement detection method and device, electronic device and storage medium
CN112749300B (en) Method, apparatus, device, storage medium and program product for video classification
CN112529195B (en) Quantum entanglement detection method and device, electronic device and storage medium
CN115906918A (en) Method and device for fine tuning of pre-training model
CN116306849A (en) Training of reverse neural network model and determining method and device of optical processor
CN114202027A (en) Execution configuration information generation method, model training method and device
CN112580732A (en) Model training method, device, equipment, storage medium and program product
CN113255922A (en) Quantum entanglement quantization method and device, electronic device and computer readable medium
US20220398834A1 (en) Method and apparatus for transfer learning
CN114580645A (en) Simulation method, device and equipment for random quantum measurement and storage medium
CN116739099A (en) Quantum state fidelity determination method and device, electronic equipment and medium
CN113344213A (en) Knowledge distillation method, knowledge distillation device, electronic equipment and computer readable storage medium
CN113361717A (en) Training method and device of quantum state data processing model, electronic equipment and medium
CN117634623A (en) Model training method, device, equipment and medium for quantum state generation
CN112561061A (en) Neural network thinning method, apparatus, device, storage medium, and program product
CN117351299A (en) Image generation and model training method, device, equipment and storage medium
CN114897146B (en) Model generation method and device and electronic equipment
CN114021729B (en) Quantum circuit operation method and system, electronic device and medium
EP2955638A1 (en) Methods and systems for processing data
Dutordoir et al. Deep Gaussian process metamodeling of sequentially sampled non-stationary response surfaces

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination