CN116523015A

CN116523015A - Optical neural network training method, device and equipment for process error robustness

Info

Publication number: CN116523015A
Application number: CN202310300927.6A
Authority: CN
Inventors: 郑纪元; 邓辰辰; 郭雨晨; 方璐; 范静涛; 吴嘉敏; 戴琼海
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2023-03-24
Filing date: 2023-03-24
Publication date: 2023-08-01

Abstract

The application relates to the technical field of optical neural networks, and in particular to an optical neural network training method, device and equipment robust to process errors. The method comprises the following steps: acquiring the processing error distribution of neuron weight parameters in an optical neural network to obtain a weight error model; during training of the optical neural network, randomly superimposing noise on the neuron weight parameters according to the weight error model until training is finished, to obtain trained neuron weight parameters; and mapping the trained neuron weight parameters to processing parameters of the optical neural network chip. This addresses the problems in the related art that optical neural network chip processing deviates from the design, so that a perfect mapping between the theoretical model and the fabricated chip cannot be guaranteed, and that calibration by compensating the phase and amplitude errors of light is time-consuming and difficult.

Description

Optical neural network training method, device and equipment for process error robustness

Technical Field

The present application relates to the field of optical neural networks, and in particular to a method, an apparatus, and a device for training an optical neural network robust to process errors.

Background

Light has the fastest propagation speed in physical space and offers multidimensional, multi-scale degrees of freedom. By replacing electrons with light and circuits with optical paths, an optical computing chip offers disruptive advantages such as high speed, parallelism, and low power consumption. In particular, as artificial intelligence algorithms have developed, it has become clear that the mathematical description of the confined propagation of light in a medium closely resembles deep neural network algorithms, so implementing neural network computation on an optical chip is expected to break through the energy-efficiency bottleneck of traditional electronic chips.

In the related art, the parameters of the neurons in an optical neural network are generally obtained through pre-training on an electronic computer and then mapped to the design parameters of the chip structure. However, chip processing always involves some deviation, so a perfect mapping between the theoretical model and the fabricated chip cannot be guaranteed. At present, this is mostly handled by using peripheral optical paths and circuits together with an error calibration algorithm to compensate for the phase and amplitude errors of the light. Such calibration is time-consuming and difficult, and a technical route that calibrates each chip one by one cannot meet the requirements of future mass production.

Disclosure of Invention

The application provides an optical neural network training method, device, electronic equipment and storage medium robust to process errors, so as to solve the problems in the related art that optical neural network chip processing deviates from the design, that a perfect mapping between the theoretical model and the fabricated chip cannot be guaranteed, and that calibration by compensating the phase and amplitude errors of light is time-consuming and difficult.

An embodiment of a first aspect of the present application provides an optical neural network training method robust to process errors, including the steps of: acquiring processing error distribution of neuron weight parameters in an optical neural network to obtain a weight error model; in the training process of the optical neural network, randomly superposing noise on the neuron weight parameters according to the weight error model until the training is finished, and obtaining the trained neuron weight parameters; and mapping the trained neuron weight parameters into processing parameters of the optical neural network chip.

Optionally, in an embodiment of the present application, acquiring the processing error distribution of the neuron weight parameters in the optical neural network to obtain the weight error model includes: acquiring the probability distribution and size range of the error between the actual processed size and the design size of the physical devices of the optical neural network chip, establishing a model correspondence between the intermediate-layer neuron weight parameters and the processed physical device size, and obtaining an error model of the intermediate-layer neuron weight parameters.

Optionally, in one embodiment of the present application, the architecture of the optical neural network includes a plurality of intermediate layers, each intermediate layer includes a plurality of neurons, and the physical structure size of a neuron corresponds to the neuron weight parameter. Establishing the model correspondence between the intermediate-layer neuron weight parameters and the processing size includes: establishing, through physical simulation, a modulation relation function between the physical size of the neurons and the physical characteristics of the light, and determining the model correspondence by using the relation function.

Optionally, in an embodiment of the present application, the optical neural network includes any one of a diffractive neural network, an interference neural network, and a scattering neural network.

Embodiments of a second aspect of the present application provide an optical neural network training device robust to process errors, comprising: the acquisition module is used for acquiring the processing error distribution of the neuron weight parameters in the optical neural network to obtain a weight error model; the determining module is used for randomly superposing noise on the neuron weight parameters according to the weight error model in the training process of the optical neural network until the training is finished, so as to obtain the trained neuron weight parameters; and the mapping module is used for mapping the trained neuron weight parameters into processing parameters of the optical neural network chip.

Optionally, in an embodiment of the present application, the obtaining module is further configured to obtain an error probability distribution and a size range of an actual processing size and a design size of a physical device of the optical neural network chip, establish a model correspondence between an intermediate layer neuron weight parameter of the optical neural network and the physical device size, and obtain the intermediate layer neuron weight parameter error model.

Optionally, in an embodiment of the present application, the architecture of the optical neural network includes a plurality of intermediate layers, each intermediate layer includes a plurality of neurons, and a physical structure size of the neurons corresponds to the neuron weight parameter, and the obtaining module is further configured to establish a modulation relation function of the physical size of the neurons and physical characteristics of the light according to physical simulation, and determine a model correspondence by using the relation function.

An embodiment of a third aspect of the present application provides an electronic device, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the optical neural network training method robust to process errors as described in the above embodiments.

A fourth aspect of the present application provides a computer readable storage medium having stored thereon a computer program for execution by a processor for implementing the optical neural network training method robust to process errors as described in the above embodiments.

Therefore, the application has at least the following beneficial effects:

According to the embodiments of the application, the random weight error of the optical neural network can be determined, and the corresponding random error is added to the intermediate-layer neuron weight parameters while the weight parameters of the neural network are being trained. As a result, the required output accuracy can be met across different input training sets, the performance impact of errors in optical neural network chip processing is reduced, and the robustness of the optical neural network chip is improved. This addresses the problems in the related art that optical neural network chip processing deviates from the design, so that a perfect mapping between the theoretical model and the fabricated chip cannot be guaranteed, and that calibration by compensating the phase and amplitude errors of light is time-consuming and difficult.

Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.

Drawings

The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a flow chart of a method for training an optical neural network robust to process errors, according to an embodiment of the present application;

FIG. 2 is a schematic diagram of diffraction slot processing errors provided in accordance with one embodiment of the present application;

FIG. 3 is a schematic diagram of optical neural network architecture and network training error back propagation according to an embodiment of the present application;

FIG. 4 is a schematic diagram of a physical layout of a diffractive neural network chip according to one embodiment of the present application;

FIG. 5 is a block diagram of an optical neural network training device robust to process errors, according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the drawings are exemplary and intended for the purpose of explaining the present application and are not to be construed as limiting the present application.

The following describes, with reference to the accompanying drawings, an optical neural network training method, an optical neural network training device, an electronic device and a storage medium robust to process errors according to embodiments of the application. To address the problems described in the background, the application provides an optical neural network training method robust to process errors. In this method, the random weight error of the optical neural network is determined, and the corresponding random error is added to the intermediate-layer neuron weight parameters while the weight parameters of the neural network are being trained. As a result, the required output accuracy can be met across different input training sets, the performance impact of errors in optical neural network chip processing is reduced, and the robustness of the optical neural network chip is improved. This addresses the problems in the related art that optical neural network chip processing deviates from the design, so that a perfect mapping between the theoretical model and the fabricated chip cannot be guaranteed, and that calibration by compensating the phase and amplitude errors of light is time-consuming and difficult.

Specifically, fig. 1 is a schematic flow chart of an optical neural network training method with robustness to process errors according to an embodiment of the present application.

As shown in fig. 1, the optical neural network training method robust to process errors includes the following steps:

In step S101, a weight error model is obtained by acquiring a processing error distribution of a neuron weight parameter in the optical neural network.

The optical neural network is not limited to a specific optical neural network, and includes, but is not limited to, a diffraction neural network, an interference neural network, and a scattering neural network.

In one embodiment of the present application, acquiring the processing error distribution of the neuron weight parameters in the optical neural network to obtain the weight error model includes: acquiring the probability distribution and size range of the error between the actual processed size and the design size of the physical devices of the optical neural network chip, establishing the model correspondence between the intermediate-layer neuron weight parameters and the processed physical device size, and obtaining the error model of the intermediate-layer neuron weight parameters.

For convenience of explanation, the embodiment of the present application takes the diffraction slot processing schematic shown in fig. 2 as an example to determine the random error distribution of the neuron weight parameters. As shown in fig. 2, the dashed box indicates the designed physical parameters. Owing to non-ideal factors such as the exposure and etching processes, the diffraction slot size obtained in actual processing deviates from the design value. The embodiment of the present application can characterize the parameters of a large processed area and fit them to obtain the probability distribution and size range of the processing error, and then obtain the corresponding deviation of the weight values by combining this with the model correspondence between the intermediate-layer neuron weight parameters of the neural network and the chip processing dimensions.
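The parameter-fitting step described above might be sketched as follows. This is only an illustration under stated assumptions: the Gaussian form of the error, the sample measurements, the function names `fit_size_error` and `weight_error_model`, and the linear sensitivity `dphi_dw` (phase change per nanometre of width error) are all hypothetical and do not come from the application.

```python
import statistics

def fit_size_error(designed_nm, measured_nm):
    """Fit a Gaussian to (measured - designed) deviations.

    Returns the mean, the sample standard deviation, and the observed range.
    """
    deviations = [m - d for d, m in zip(designed_nm, measured_nm)]
    mu = statistics.fmean(deviations)
    sigma = statistics.stdev(deviations)
    return mu, sigma, (min(deviations), max(deviations))

def weight_error_model(mu_nm, sigma_nm, dphi_dw):
    """Propagate the size error to a weight (phase) error.

    Assumes the size-to-phase relation is locally linear with slope dphi_dw.
    """
    return mu_nm * dphi_dw, abs(dphi_dw) * sigma_nm

# Hypothetical characterization data: designed vs. fabricated slot widths (nm).
designed = [200.0, 200.0, 250.0, 250.0, 300.0, 300.0]
measured = [203.0, 198.0, 252.0, 247.0, 304.0, 299.0]

mu, sigma, size_range = fit_size_error(designed, measured)
phase_mu, phase_sigma = weight_error_model(mu, sigma, dphi_dw=0.01)
```

In practice the fitted distribution need not be Gaussian; any distribution that matches the characterization data could be substituted in the same place.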

In one embodiment of the present application, establishing the model correspondence between the intermediate-layer neuron weight parameters and the processing dimensions includes: establishing, through physical simulation, a phase modulation relation function between the physical size of the diffraction slot and the light, and determining the model correspondence by using the relation function.

In embodiments of the present application, the architecture of the optical neural network generally includes an input layer, intermediate layers, and an output layer; each intermediate layer includes a plurality of neurons, and the physical structure size of a neuron corresponds to its neuron weight parameter, as shown in fig. 3. An optical neural network chip is usually designed by updating the weight parameters of each intermediate-layer neuron, such as h₁ to hₙ in fig. 3, on a computer using the error backpropagation algorithm. The weight parameters of a specific optical neural network can then be mapped to physical processing parameters of the optical chip through a suitable model. Taking the diffractive neural network chip in fig. 4 as an example, the chip corresponds one-to-one with the optical neural network architecture in fig. 3 and has three intermediate layers. Each intermediate layer is composed of a plurality of diffraction slots, and each diffraction slot, or a group of diffraction slots, can be regarded as a neuron. A phase modulation relation function between parameters such as the physical dimensions of the diffraction slots and the light can be established through physical simulation, where the phase change of the light is the weight parameter of the neuron in the neural network.
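One simple way to represent such a relation function in software is a lookup table produced by physical simulation, interpolated between sample points. The sketch below assumes this approach; the table values, the names `SIM_WIDTHS_NM`, `SIM_PHASES_RAD`, and `phase_from_width`, and the choice of slot width as the only dimension are hypothetical, since the application does not specify the form of the function.

```python
from bisect import bisect_right

# Hypothetical simulation results: slot width (nm) -> phase shift (rad).
SIM_WIDTHS_NM = [100.0, 150.0, 200.0, 250.0, 300.0]
SIM_PHASES_RAD = [0.0, 0.8, 1.7, 2.4, 3.0]

def phase_from_width(width_nm):
    """Piecewise-linear interpolation of the simulated width-to-phase curve."""
    if not SIM_WIDTHS_NM[0] <= width_nm <= SIM_WIDTHS_NM[-1]:
        raise ValueError("width outside the simulated range")
    # Index of the right-hand sample of the enclosing interval.
    i = min(bisect_right(SIM_WIDTHS_NM, width_nm), len(SIM_WIDTHS_NM) - 1)
    w0, w1 = SIM_WIDTHS_NM[i - 1], SIM_WIDTHS_NM[i]
    p0, p1 = SIM_PHASES_RAD[i - 1], SIM_PHASES_RAD[i]
    return p0 + (p1 - p0) * (width_nm - w0) / (w1 - w0)

# Weight (phase) deviation caused by a 5 nm over-etch of a 200 nm slot.
phase_deviation = phase_from_width(205.0) - phase_from_width(200.0)
```

With such a function, a dimensional error drawn from the processing error distribution translates directly into a deviation of the corresponding neuron weight.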

In step S102, during the training process of the optical neural network, noise is randomly superimposed on the neuron weight parameters according to the weight error model until the training is completed, so as to obtain the trained neuron weight parameters.

It can be understood that in the embodiment of the application, a proper amount of random error can be added to the neuron weight parameters of the middle layer in the training process of the optical neural network chip to obtain the trained neuron weight parameters, so that different input training sets can meet the requirement of output accuracy, and the robustness of the optical neural network chip is improved. In the actual implementation process, the size range and probability distribution of the random error in the embodiment of the application can be determined by combining chip processing parameter distribution with the mapping relation between the chip physical parameters and the neuron weights.
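The noise-superposition step can be illustrated with a deliberately small toy model. The sketch below assumes zero-mean Gaussian weight noise and uses a linear model trained by stochastic gradient descent as a stand-in for the optical network; the function name `train_with_weight_noise`, the data, and the hyperparameters are hypothetical and are not taken from the application.

```python
import random

def train_with_weight_noise(data, n_weights, sigma, lr=0.05, epochs=200, seed=0):
    """SGD on a linear model, evaluating each step at noise-perturbed weights."""
    rng = random.Random(seed)
    w = [0.0] * n_weights
    for _ in range(epochs):
        for x, y in data:
            # Sample a fabrication-like perturbation from the weight error model.
            w_noisy = [wi + rng.gauss(0.0, sigma) for wi in w]
            # Forward pass uses the perturbed weights.
            err = sum(wi * xi for wi, xi in zip(w_noisy, x)) - y
            # Gradient of 0.5 * err**2 at the noisy weights, applied to the clean weights.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

# Toy regression task whose exact solution is w = [2, -1].
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
trained = train_with_weight_noise(data, n_weights=2, sigma=0.05)
```

The key design point is that the stored weights stay noise-free while every forward pass sees a freshly perturbed copy, so the optimizer is steered toward weights whose output remains accurate under the expected processing error.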

In step S103, the trained neuron weight parameters are mapped to processing parameters of the optical neural network chip.

According to the embodiment of the application, the neuron weight parameters of the network model obtained through training can be mapped to the processing parameters of the optical neural network chip, thereby reducing the influence of optical neural network chip processing on neural network performance.
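For a diffractive chip, this mapping could amount to inverting the simulated dimension-to-phase relation. The sketch below assumes a monotonically increasing lookup table and linear interpolation; the table values and the names `SIM_WIDTHS_NM`, `SIM_PHASES_RAD`, and `width_from_phase` are hypothetical, and the application does not prescribe this particular inversion.

```python
from bisect import bisect_left

# Hypothetical simulation results: slot width (nm) -> phase shift (rad).
# The inversion below relies on the phases increasing monotonically.
SIM_WIDTHS_NM = [100.0, 150.0, 200.0, 250.0, 300.0]
SIM_PHASES_RAD = [0.0, 0.8, 1.7, 2.4, 3.0]

def width_from_phase(phase_rad):
    """Invert the width-to-phase curve by piecewise-linear interpolation."""
    if not SIM_PHASES_RAD[0] <= phase_rad <= SIM_PHASES_RAD[-1]:
        raise ValueError("phase outside the range reachable by the simulated widths")
    # Index of the right-hand sample of the enclosing interval.
    i = max(bisect_left(SIM_PHASES_RAD, phase_rad), 1)
    p0, p1 = SIM_PHASES_RAD[i - 1], SIM_PHASES_RAD[i]
    w0, w1 = SIM_WIDTHS_NM[i - 1], SIM_WIDTHS_NM[i]
    return w0 + (w1 - w0) * (phase_rad - p0) / (p1 - p0)

# Map trained phase weights to slot widths for the chip layout.
trained_phases = [0.4, 1.25, 2.7]
slot_widths_nm = [width_from_phase(p) for p in trained_phases]
```

Weights that fall outside the reachable phase range raise an error here; a real design flow would need its own policy for such cases, for example constraining the weights during training.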

According to the optical neural network training method robust to process errors provided by the embodiments of the application, the random weight error of the optical neural network is determined, and the corresponding random error is added to the intermediate-layer neuron weight parameters while the weight parameters of the neural network are being trained. As a result, the required output accuracy can be met across different input training sets, the performance impact of errors in optical neural network chip processing is reduced, and the robustness of the optical neural network chip is improved. This addresses the problems in the related art that optical neural network chip processing deviates from the design, so that a perfect mapping between the theoretical model and the fabricated chip cannot be guaranteed, and that calibration by compensating the phase and amplitude errors of light is time-consuming and difficult.

An optical neural network training device robust to process errors according to an embodiment of the present application will be described next with reference to the accompanying drawings.

Fig. 5 is a block schematic diagram of an optical neural network training device robust to process errors in accordance with an embodiment of the present application.

As shown in fig. 5, the optical neural network training device 10 robust to process errors includes: an acquisition module 100, a determination module 200 and a mapping module 300.

The acquiring module 100 is configured to acquire a processing error distribution of a neuron weight parameter in the optical neural network to obtain a weight error model; during the training process of the optical neural network, the determining module 200 randomly superimposes noise on the neuron weight parameters according to the weight error model until the training is finished, so as to obtain the trained neuron weight parameters; the mapping module 300 is configured to map the trained neuron weight parameters to processing parameters of the optical neural network chip.
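The division into three modules can be pictured as three callables chained in order. The following is a minimal structural sketch only: the class name `TrainingDevice`, its field names, and the stub implementations passed to it are hypothetical placeholders, not the application's implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class TrainingDevice:
    # Acquisition module: returns (mean, std) of the weight error model.
    acquire: Callable[[], Tuple[float, float]]
    # Determining module: trains with noise drawn from the model, returns weights.
    train: Callable[[float, float], List[float]]
    # Mapping module: converts trained weights to chip processing parameters.
    map_to_process: Callable[[List[float]], List[float]]

    def run(self) -> List[float]:
        mu, sigma = self.acquire()
        weights = self.train(mu, sigma)
        return self.map_to_process(weights)

# Stub modules used purely to show the data flow between the three stages.
device = TrainingDevice(
    acquire=lambda: (0.0, 0.05),
    train=lambda mu, sigma: [0.4, 1.25],
    map_to_process=lambda ws: [100.0 + 62.5 * w for w in ws],
)
processing_params = device.run()
```

Keeping the three stages behind separate interfaces means the error model or the mapping can be swapped for a different network type without touching the training stage.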

In one embodiment of the present application, the acquisition module 100 is further configured to acquire the probability distribution and size range of the error between the actual processed size and the design size of the physical devices of the optical neural network chip, establish a model correspondence between the intermediate-layer neuron weight parameters of the optical neural network and the physical device size, and obtain an error model of the intermediate-layer neuron weight parameters.

In one embodiment of the present application, the architecture of the optical neural network includes a plurality of intermediate layers, each intermediate layer includes a plurality of neurons, and the physical structure size of a neuron corresponds to the neuron weight parameter. The acquisition module 100 is further configured to establish, through physical simulation, a relation function between the physical size of the neurons and the physical characteristics of the light, and to determine the model correspondence by using the relation function.

In one embodiment of the present application, the optical neural network includes any one of a diffractive neural network, an interference neural network, and a scattering neural network.

It should be noted that the foregoing explanation of the embodiment of the optical neural network training method with robust process error is also applicable to the optical neural network training device with robust process error of this embodiment, and will not be repeated here.

According to the optical neural network training device robust to process errors provided by the embodiments of the application, the random weight error of the optical neural network is determined, and the corresponding random error is added to the intermediate-layer neuron weight parameters while the weight parameters of the neural network are being trained. As a result, the required output accuracy can be met across different input training sets, the performance impact of errors in optical neural network chip processing is reduced, and the robustness of the optical neural network chip is improved. This addresses the problems in the related art that optical neural network chip processing deviates from the design, so that a perfect mapping between the theoretical model and the fabricated chip cannot be guaranteed, and that calibration by compensating the phase and amplitude errors of light is time-consuming and difficult.

Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device may include:

a memory 601, a processor 602, and a computer program stored on the memory 601 and executable on the processor 602.

The processor 602, when executing the program, implements the optical neural network training method robust to process errors provided in the above embodiments.

Further, the electronic device further includes:

a communication interface 603 for communication between the memory 601 and the processor 602.

A memory 601 for storing a computer program executable on the processor 602.

The memory 601 may include a high-speed RAM (Random Access Memory), and may also include a nonvolatile memory, such as at least one disk memory.

If the memory 601, the processor 602, and the communication interface 603 are implemented independently, the communication interface 603, the memory 601, and the processor 602 may be connected to each other through a bus and communicate with each other. The bus may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, only one thick line is shown in fig. 6, but this does not mean that there is only one bus or only one type of bus.

Alternatively, in a specific implementation, if the memory 601, the processor 602, and the communication interface 603 are integrated on a chip, the memory 601, the processor 602, and the communication interface 603 may perform communication with each other through internal interfaces.

The processor 602 may be a CPU (Central Processing Unit), an ASIC (Application Specific Integrated Circuit), or one or more integrated circuits configured to implement embodiments of the present application.

Embodiments of the present application also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the optical neural network training method robust to process errors as described above.

In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of the different embodiments or examples, provided they do not contradict each other.

Furthermore, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality" means at least two, such as two, three, etc., unless explicitly defined otherwise.

Any process or method description in a flow chart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application includes further implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art to which the embodiments of the present application pertain.

It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, a plurality of steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable gate arrays, field programmable gate arrays, and the like.

Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.

Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the application, and that changes, modifications, substitutions, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims

1. An optical neural network training method robust to process errors, comprising the steps of:

acquiring processing error distribution of neuron weight parameters in an optical neural network to obtain a weight error model;

in the training process of the optical neural network, randomly superposing noise on the neuron weight parameters according to the weight error model until the training is finished, and obtaining the trained neuron weight parameters;

and mapping the trained neuron weight parameters into processing parameters of the optical neural network chip.

2. The method of claim 1, wherein the obtaining the processing error distribution of the neuron weight parameters in the optical neural network to obtain the weight error model comprises:

acquiring the probability distribution and size range of the error between the actual processed size and the design size of the physical devices of the optical neural network chip, establishing the model correspondence between the intermediate-layer neuron weight parameters of the optical neural network and the physical device size, and obtaining the intermediate-layer neuron weight parameter error model.

3. The method of claim 2, wherein the architecture of the optical neural network includes a plurality of intermediate layers, each intermediate layer including a plurality of neurons, and wherein a physical structural dimension of a neuron corresponds to the neuron weight parameter, and wherein establishing a model correspondence of the intermediate layer neuron weight parameter to the process dimension includes:

establishing, through physical simulation, a modulation relation function between the physical size of the neurons and the physical characteristics of the light, and determining the model correspondence by using the relation function.

4. A method according to any one of claims 1-3, wherein the optical neural network comprises any one of a diffractive neural network, an interference neural network, and a scattering neural network.

5. An optical neural network training device robust to process errors, comprising:

the acquisition module is used for acquiring the processing error distribution of the neuron weight parameters in the optical neural network to obtain a weight error model;

the determining module is used for randomly superposing noise on the neuron weight parameters according to the weight error model in the training process of the optical neural network until the training is finished, so as to obtain the trained neuron weight parameters;

and the mapping module is used for mapping the trained neuron weight parameters into processing parameters of the optical neural network chip.

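6. The apparatus of claim 5, wherein the acquisition module is further configured to:

acquire the probability distribution and size range of the error between the actual processed size and the design size of the physical devices of the optical neural network chip, establish a model correspondence between the intermediate-layer neuron weight parameters of the optical neural network and the physical device size, and obtain the intermediate-layer neuron weight parameter error model.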

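7. The apparatus of claim 6, wherein the architecture of the optical neural network comprises a plurality of intermediate layers, each intermediate layer comprising a plurality of neurons, and wherein a physical structural dimension of a neuron corresponds to the neuron weight parameter, the acquisition module being further configured to:

establish, through physical simulation, a modulation relation function between the physical size of the neurons and the physical characteristics of the light, and determine the model correspondence by using the relation function.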

8. The apparatus of any of claims 5-7, wherein the optical neural network comprises any one of a diffractive neural network, an interference neural network, and a scattering neural network.

9. An electronic device, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the method of training an optical neural network robust to process errors as claimed in any one of claims 1 to 4.

10. A computer readable storage medium having stored thereon a computer program, the program being executable by a processor for implementing the optical neural network training method of any one of claims 1-4 that is robust to process errors.