CN113850298A

CN113850298A - Image identification method and device and related equipment

Info

Publication number: CN113850298A
Application number: CN202111018960.7A
Authority: CN
Inventors: 黄萍; 吴睿振; 陈静静; 王凛
Original assignee: Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Current assignee: Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority date: 2021-09-03
Filing date: 2021-09-03
Publication date: 2021-12-28

Abstract

The application discloses an image recognition method applied to an optical device. The method comprises: when a model training instruction is received, determining an image data set according to the model training instruction; performing model training on the image data set by using a zeroth-order optimization algorithm and a first-order optimization algorithm to obtain an optical neural network model; when an image recognition instruction is received, determining a target image according to the image recognition instruction; and recognizing the target image by using the optical neural network model to obtain an image recognition result. The image recognition method can reduce computing power consumption and increase computing speed, thereby improving image recognition efficiency. The application also discloses an image recognition apparatus, an image recognition device and a computer-readable storage medium, which have the same beneficial effects.

Description

Image identification method and device and related equipment

Technical Field

The present application relates to the field of image processing technologies, and in particular, to an image recognition method, an image recognition apparatus, an image recognition device, and a computer-readable storage medium.

Background

With the development of science and technology, society has entered the era of cloud + AI + 5G. Meeting the computing demands of cloud + AI + 5G requires dedicated chips that support heavy computation. The chip is the foundation and core of the modern electronic information industry: from products as small as mobile phones, computers and digital cameras to technologies as large as 5G, the Internet of Things and cloud computing, every advance builds on continuous breakthroughs in chip technology.

With the rapid development of globalization and of science and technology, the amount of data to be processed is growing rapidly, and the corresponding data processing models and algorithms are multiplying, so the demands on computing power and power consumption keep rising. However, existing electronic computers suffer from transmission bottlenecks, rising power consumption and computing-power bottlenecks, and find it increasingly difficult to meet the computing-power and power-consumption demands of the big data era. How to reduce computing power consumption while increasing computing speed is therefore a critical problem at present.

Disclosure of Invention

An object of the present application is to provide an image recognition method that can reduce computing power consumption and increase computing speed, thereby improving image recognition efficiency; another object of the present application is to provide an image recognition apparatus, an image recognition device and a computer-readable storage medium, all having the above-mentioned advantages.

In a first aspect, the present application provides an image recognition method applied to an optical device, including:

when a model training instruction is received, determining an image data set according to the model training instruction;

performing model training on the image data set by utilizing a zeroth-order optimization algorithm and a first-order optimization algorithm to obtain an optical neural network model;

when an image recognition instruction is received, determining a target image according to the image recognition instruction;

and identifying the target image by using the optical neural network model to obtain an image identification result.

Preferably, the image data set includes a training data set and a testing data set, and the model training is performed on the image data set by using a zeroth-order optimization algorithm and a first-order optimization algorithm to obtain the optical neural network model, including:

performing model training on the training data set by using the zeroth-order optimization algorithm and the first-order optimization algorithm to obtain an initial optical neural network model;

and correcting the initial optical neural network model by using the test data set to obtain the optical neural network model.

Preferably, the zeroth order optimization algorithm is specifically an average random gradient estimation algorithm; the first-order optimization algorithm is specifically an SVRG algorithm.

Preferably, the performing model training on the training data set by using the zeroth-order optimization algorithm and the first-order optimization algorithm to obtain an initial optical neural network model includes:

randomly extracting a training subset from the training data set;

calculating a first derivative of the loss function with respect to the initial model parameters using the training subset;

randomly extracting a preset number of training samples from the training subset;

calculating the preset number of training samples by using the average random gradient estimation algorithm to obtain a first gradient estimation value;

calculating the first gradient estimation value and the first derivative by using the SVRG algorithm to obtain a first derivative estimation value;

calculating the first derivative estimated value and the initial model parameter by using a gradient descent formula to obtain a first model parameter;

taking the first model parameter as the initial model parameter, and returning to the step of randomly extracting a preset number of training samples from the training subset, until the model parameter after a preset number of cycles is obtained;

taking the model parameter after the preset number of cycles as the initial model parameter, and returning to the step of randomly extracting a training subset from the training data set, until the model parameter after a preset number of iterations is obtained;

and taking the model parameter after the preset number of iterations as a final model parameter, and determining the initial optical neural network model according to the final model parameter.

Preferably, the image recognition method further includes:

and optimizing the optical neural network model according to the image recognition result to obtain the optimized optical neural network model.

Preferably, the image dataset is specifically the MNIST dataset.

In a second aspect, the present application further discloses an image recognition apparatus applied to an optical device, including:

the data set acquisition module is used for determining an image data set according to a model training instruction when the model training instruction is received;

the model training module is used for carrying out model training on the image data set by utilizing a zeroth-order optimization algorithm and a first-order optimization algorithm to obtain an optical neural network model;

the image acquisition module is used for determining a target image according to an image identification instruction when the image identification instruction is received;

and the image identification module is used for identifying the target image by utilizing the optical neural network model to obtain an image identification result.

Preferably, the model training module includes:

the model training unit is used for carrying out model training on the training data set by utilizing the zeroth-order optimization algorithm and the first-order optimization algorithm to obtain an initial optical neural network model;

and the model correction unit is used for correcting the initial optical neural network model by using the test data set to obtain the optical neural network model.

In a third aspect, the present application also discloses an image recognition device, including:

a memory for storing a computer program;

a processor for implementing the steps of any of the image recognition methods as described above when executing the computer program.

In a fourth aspect, the present application also discloses a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the image recognition methods described above.

The image recognition method is applied to an optical device and comprises the steps that when a model training instruction is received, an image data set is determined according to the model training instruction; performing model training on the image data set by utilizing a zeroth-order optimization algorithm and a first-order optimization algorithm to obtain an optical neural network model; when an image recognition instruction is received, determining a target image according to the image recognition instruction; and identifying the target image by using the optical neural network model to obtain an image identification result.

Thus, in the image recognition method provided by the present application, the optical neural network model is first trained with a zeroth-order optimization algorithm and a first-order optimization algorithm. The zeroth-order optimization algorithm can be integrated with the first-order optimization algorithm, with the true gradient in the first-order optimization algorithm replaced by the zeroth-order gradient, which effectively improves model accuracy. Furthermore, exploiting the low latency, low power consumption, high bandwidth and high energy efficiency of the optical device, the optical neural network model is trained directly on the optical device; that is, the gradient descent optimization process is implemented on the optical device. This avoids a large number of photoelectric conversions and complex derivative operations on the optical device and enables rapid training of an all-optical neural network model.

The image recognition apparatus, the image recognition device and the computer-readable storage medium provided by the present application all have the above beneficial effects, which are not repeated here.

Drawings

In order to more clearly illustrate the technical solutions in the prior art and the embodiments of the present application, the drawings that are needed to be used in the description of the prior art and the embodiments of the present application will be briefly described below. Of course, the following description of the drawings related to the embodiments of the present application is only a part of the embodiments of the present application, and it will be obvious to those skilled in the art that other drawings can be obtained from the provided drawings without any creative effort, and the obtained other drawings also belong to the protection scope of the present application.

Fig. 1 is a schematic flowchart of an image recognition method provided in the present application;

Fig. 2 is an exemplary image from the MNIST dataset provided in the present application;

Fig. 3 is a schematic diagram of a unitary matrix decomposition based on a triangular structure provided in the present application;

Fig. 4 is a schematic structural diagram of an image recognition apparatus provided in the present application;

Fig. 5 is a schematic structural diagram of an image recognition device provided in the present application.

Detailed Description

The core of the application is to provide an image identification method, the image identification method can reduce the operation power consumption, improve the operation speed and further improve the image identification efficiency; another core of the present application is to provide an image recognition apparatus, a device and a computer-readable storage medium, which also have the above-mentioned advantages.

In order to more clearly and completely describe the technical solutions in the embodiments of the present application, the technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The embodiment of the application provides an image identification method.

Referring to fig. 1, fig. 1 is a schematic flow chart of an image recognition method provided in the present application, where the image recognition method is applicable to an optical device, and the method includes:

S101: when a model training instruction is received, determining an image data set according to the model training instruction;

This step acquires the image data set. The image data set is the sample data set used to train the optical neural network model and contains a large amount of sample image data of known image classes, i.e., each piece of sample image data has a corresponding image class label. Specifically, when a model training instruction is received, the image data set can be determined according to the instruction. The image data set may be attached to the model training instruction in advance, so that the optical device obtains it by parsing the instruction; alternatively, a technician may input the model training instruction and the image data set at the same time, so that the optical device obtains the image data set when it receives the instruction and then enters the subsequent model training process.

It should be noted that the image recognition method provided by the present application includes a training process for the optical neural network model and an image recognition process based on that model, and both processes can be implemented on an optical device. Optical computing has advantages over electrical computing. For example, optical signals propagate at the speed of light, which can greatly increase speed; light has natural parallel processing capability, and wavelength division multiplexing is a mature technology, so data processing capability, capacity and bandwidth are greatly improved; and the power consumption of optical computing is expected to be as low as 10^-18 J/bit, with photonic devices being hundreds of times faster than electronic devices at the same power consumption. Because the optical device offers low latency, low power consumption, high bandwidth and high energy efficiency, performing model training and image recognition on the optical device can reduce computing power consumption and increase computing speed, thereby greatly improving image recognition efficiency. The optical device may specifically be a photonic chip; the specific type does not affect the implementation of the technical solution and is not limited in the present application.

As a preferred embodiment, the image data set may specifically be the MNIST data set (Modified National Institute of Standards and Technology database, a database of handwritten digits).

This preferred embodiment provides a specific type of image data set, namely the MNIST data set. The MNIST data set is a classic small image classification data set containing 70000 sample pictures of handwritten digits. As shown in fig. 2, an exemplary image from the MNIST data set, each sample picture consists of 28 × 28 pixels, and each pixel is represented by a gray-scale value. Each sample picture has a corresponding label, a single decimal digit that gives the picture category. The MNIST data set is widely used in machine learning, deep learning and related fields, mostly for testing the effect of algorithms.

S102: performing model training on the image data set by utilizing a zeroth-order optimization algorithm and a first-order optimization algorithm to obtain an optical neural network model;

This step performs model training: the image data set is used for model training based on a zeroth-order optimization algorithm and a first-order optimization algorithm to obtain the optical neural network model, which is then used to recognize images of unknown type (i.e., target images).

It can be understood that the types of operations an optical device can currently implement are limited. The first-derivative computation required by gradient descent is extremely complex and difficult to implement on an optical device, and if the derivative is computed with conventional electrical signals while the model parameters are updated on the optical device, a large number of photoelectric conversions are needed. Zeroth-order optimization (ZOO) can therefore be used in place of the first-derivative computation, so that training of an all-optical neural network can be completed with operations that the optical device can implement, such as sampling, matrix multiplication and summation.

Specifically, zeroth-order optimization algorithms are a family of optimization methods derived for model training in which the relationship between the loss function and the trainable parameters is unknown, so that no explicit expression of the gradient is available. A zeroth-order optimization algorithm performs gradient-based parameter updates using only the output of the neural network. It can handle higher-dimensional problems than conventional methods (e.g., Bayesian optimization) and can be integrated with state-of-the-art first-order optimization algorithms by replacing the true gradient in the first-order algorithm with a zeroth-order gradient. Training directly on the optical device with the back propagation algorithm requires a large number of matrix multiplications, particularly for large models. The zeroth-order optimization method avoids the costly derivative operations of conventional back propagation and estimates the first derivative using only operations such as sampling, function queries and simple matrix multiplication.

Furthermore, parameter updating can be realized by combining a zeroth-order optimization algorithm and a first-order optimization algorithm, so that the construction of an optical neural network model is realized, a large amount of photoelectric conversion and complex derivation operation on an optical device are avoided, and the rapid construction of the model is realized.

As a preferred embodiment, the above-mentioned image data set includes a training data set and a testing data set, and the model training is performed on the image data set by using a zeroth order optimization algorithm and a first order optimization algorithm to obtain the optical neural network model, which may include: performing model training on a training data set by utilizing a zeroth-order optimization algorithm and a first-order optimization algorithm to obtain an initial optical neural network model; and correcting the initial optical neural network model by using the test data set to obtain the optical neural network model.

In order to effectively ensure the accuracy of the model and then ensure the accuracy of the image recognition result, the model construction process can be divided into two parts, namely model training and model correction, and the image data set can also be divided into two parts, namely a training data set and a test data set, wherein the training data set is used for realizing model training to obtain the initial optical neural network model, and the test data set is used for realizing model correction to obtain the optical neural network model with higher accuracy. As described above, the MNIST data set contains 70000 sample pictures of handwritten numbers, wherein 60000 sample pictures can be used as a training data set, and 10000 sample pictures can be used as a testing data set, thereby implementing the construction of the optical neural network model.
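The 60000/10000 split described above can be sketched with index arrays. This is a minimal illustration assuming NumPy and assuming the split simply follows dataset order (the text does not say how samples are assigned); the names `NUM_SAMPLES`, `train_idx`, `test_idx` and `dummy_image` are hypothetical.

```python
import numpy as np

NUM_SAMPLES = 70_000      # total MNIST sample pictures, per the text
NUM_TRAIN = 60_000        # pictures used as the training data set
IMAGE_SHAPE = (28, 28)    # each picture is 28 x 28 gray-scale pixels

# Assumption: the split follows dataset order; the source does not specify it.
indices = np.arange(NUM_SAMPLES)
train_idx, test_idx = indices[:NUM_TRAIN], indices[NUM_TRAIN:]

# A placeholder picture flattened to a 784-element vector,
# e.g. as input to a fully connected layer (an assumption, not from the source).
dummy_image = np.zeros(IMAGE_SHAPE, dtype=np.uint8)
flat = dummy_image.reshape(-1)
```

Working with indices rather than copies of the pictures keeps the two subsets disjoint by construction.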

As a preferred embodiment, the zeroth-order optimization algorithm may specifically be an average random gradient estimation algorithm; the first-order optimization algorithm may specifically be an SVRG (Stochastic Variance Reduced Gradient) algorithm.

The preferred embodiment provides a specific type of zeroth order optimization algorithm and a first order optimization algorithm, wherein the zeroth order optimization algorithm can be specifically an average random gradient estimation algorithm; the first order optimization algorithm may specifically be the SVRG algorithm.

First, common zeroth order optimization algorithms include the following two:

1. Random gradient estimation:

where L is the loss function of the optical neural network, Φ is the model parameter, the left-hand side is a zeroth-order approximation of the first derivative ▽_ΦL, d is the number of optimization variables, S is the batch size, μ > 0 is a smoothing parameter, x_i is the i-th sample in a batch, and u is sampled at random from the uniform distribution on the unit sphere; the training samples of one batch share one set of u.
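Written out with the symbols defined above, and assuming the standard single-direction form of the zeroth-order random gradient estimator with one direction u shared across the batch, the estimate would read as follows. This is a reconstruction under that assumption; the hat notation for the estimate is also an assumption.

```latex
\hat{\nabla}_{\Phi} L(\Phi) \;=\; \frac{d}{\mu S} \sum_{i=1}^{S} \bigl[ L(x_i;\, \Phi + \mu u) - L(x_i;\, \Phi) \bigr]\, u
```

Only two evaluations of the loss per sample are needed, at Φ and at Φ + μu, so no analytic derivative is required.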

2. Average random gradient estimation:

Compared with random gradient estimation, average random gradient estimation introduces a sampling factor q, and each parameter update is based on the average of q samplings. Sampling q times makes a single iteration more expensive, but the variance of the gradient estimate is smaller, so convergence is faster and the final solution is more accurate.
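The averaging over q samplings can be sketched as follows. This is a minimal NumPy sketch assuming the common form in which q independent directions are drawn and their single-direction estimates are averaged; the function names and the quadratic toy loss are hypothetical and serve only as a check against a known gradient.

```python
import numpy as np

def sample_unit_sphere(d, rng):
    """Draw one direction uniformly from the unit sphere in R^d."""
    u = rng.standard_normal(d)
    return u / np.linalg.norm(u)

def avg_random_grad_estimate(loss, phi, batch, mu, q, rng):
    """Average q single-direction zeroth-order estimates of d(loss)/d(phi).

    loss(phi, batch) returns the mean loss over the batch; only function
    queries are used, never an analytic derivative.
    """
    d = phi.size
    base = loss(phi, batch)                  # one query at the current point
    grad = np.zeros(d)
    for _ in range(q):
        u = sample_unit_sphere(d, rng)       # one direction shared by the batch
        grad += (d / mu) * (loss(phi + mu * u, batch) - base) * u
    return grad / q

# Toy quadratic loss with a known gradient (illustrative only).
def toy_loss(phi, batch):
    return 0.5 * np.mean(np.sum((phi - batch) ** 2, axis=1))

rng = np.random.default_rng(0)
phi = np.array([1.0, -2.0, 0.5])
batch = np.zeros((4, 3))
true_grad = phi - batch.mean(axis=0)
estimate = avg_random_grad_estimate(toy_loss, phi, batch, mu=1e-4, q=5000, rng=rng)
```

A larger q costs more loss queries per update but shrinks the spread of `estimate` around `true_grad`, which is the trade-off the text describes.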

Further, the first-order optimization algorithm most commonly used in neural network training is stochastic gradient descent (SGD), in which each iteration updates the parameters based on one randomly sampled batch. The gradient estimated by SGD is an unbiased estimate of the true gradient, but when the variance introduced by the zeroth-order optimization algorithm is added on top, the estimation variance of the ZO-SGD algorithm is large, and the accuracy of the solution it finally converges to leaves room for improvement.

The SVRG algorithm is an improved version of SGD. Its idea is to compute the gradient over all samples once every so often, which divides training into phases.

A single update within each phase updates the current model parameters using the following formula:

where the gradient is computed at most twice in each update, L is the loss function to be optimized, Φ is the model parameter, the superscript denotes the index of the outer loop, the subscript denotes the index of the inner loop, the sample indexed by k is the training sample used at the k-th step of the inner loop, and x_i denotes the training samples of the batch corresponding to the outer loop.

The SVRG algorithm introduces an important concept called variance reduction, which gives the variance of the gradient estimate an upper bound that shrinks as training proceeds. As a result, the algorithm achieves linear convergence, converges faster than SGD, and finally reaches a better solution. Therefore, compared with the ZO-SGD algorithm, the ZO-SVRG algorithm can accelerate convergence while reducing estimation error.
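For reference, the update can be written out assuming the standard SVRG form. Here s is the outer-loop index, k the inner-loop index, Φ_0^s the parameters at the start of outer loop s, x_{i_k} the sample drawn at inner step k, and g^s the gradient averaged over the S samples of that outer loop; these symbol choices are assumptions made to match the description above.

```latex
g^{s} = \frac{1}{S} \sum_{i=1}^{S} \nabla_{\Phi} L(x_i;\, \Phi_{0}^{s}),
\qquad
\Phi_{k+1}^{s} = \Phi_{k}^{s} - \alpha \Bigl[ \nabla_{\Phi} L(x_{i_k};\, \Phi_{k}^{s}) - \nabla_{\Phi} L(x_{i_k};\, \Phi_{0}^{s}) + g^{s} \Bigr]
```

Each inner step evaluates the gradient at two points, Φ_k^s and Φ_0^s, which matches the remark that the gradient is computed at most twice per update. In a ZO-SVRG variant, each gradient term would be replaced by its zeroth-order estimate.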

As a preferred embodiment, performing model training on the training data set by using the zeroth-order optimization algorithm and the first-order optimization algorithm to obtain the initial optical neural network model may include:

randomly extracting a training subset from the training data set;

calculating a first derivative of the loss function with respect to the initial model parameters using the training subset;

randomly extracting a preset number of training samples from the training subset;

calculating the preset number of training samples by using the average random gradient estimation algorithm to obtain a first gradient estimation value;

calculating the first gradient estimation value and the first derivative by using the SVRG algorithm to obtain a first derivative estimation value;

calculating the first derivative estimation value and the initial model parameters by using a gradient descent formula to obtain a first model parameter;

taking the first model parameter as the initial model parameter, and returning to the step of randomly extracting a preset number of training samples from the training subset, until the model parameter after a preset number of cycles is obtained;

taking the model parameter after the preset number of cycles as the initial model parameter, and returning to the step of randomly extracting a training subset from the training data set, until the model parameter after a preset number of iterations is obtained;

and taking the model parameter after the preset number of iterations as the final model parameter, and determining the initial optical neural network model according to the final model parameter.

The image recognition method of this preferred embodiment trains the initial optical neural network model based on the average random gradient estimation algorithm and the SVRG algorithm. Specifically, the model parameters are updated through multiple inner-loop computations and outer iterations based on these two algorithms, yielding model parameters of higher accuracy, namely the final model parameters; the initial optical neural network model is then determined from the final model parameters.
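The nested loop described above can be sketched in NumPy as follows. This is a hedged sketch, not the patented implementation: it assumes the subset derivative is itself obtained by zeroth-order estimation, that the same directions are used for the two estimates in each inner step, and that a least-squares toy problem stands in for the optical neural network. All function names and hyperparameter values are hypothetical.

```python
import numpy as np

def unit_directions(q, d, rng):
    """Sample q directions uniformly from the unit sphere in R^d."""
    u = rng.standard_normal((q, d))
    return u / np.linalg.norm(u, axis=1, keepdims=True)

def zo_grad(loss, phi, data, mu, directions):
    """Averaged zeroth-order gradient estimate from loss queries only."""
    d = phi.size
    base = loss(phi, data)
    est = np.zeros(d)
    for u in directions:
        est += (d / mu) * (loss(phi + mu * u, data) - base) * u
    return est / len(directions)

def zo_svrg(loss, phi0, train_set, *, outer_iters, inner_cycles,
            subset_size, batch_size, lr, mu, q, seed=0):
    rng = np.random.default_rng(seed)
    phi = phi0.copy()
    for _ in range(outer_iters):                     # preset number of iterations
        subset = train_set[rng.choice(len(train_set), subset_size, replace=False)]
        snapshot = phi.copy()                        # initial parameters of this round
        g_subset = zo_grad(loss, snapshot, subset, mu,
                           unit_directions(q, phi.size, rng))
        for _ in range(inner_cycles):                # preset number of cycles
            batch = subset[rng.choice(subset_size, batch_size, replace=False)]
            dirs = unit_directions(q, phi.size, rng) # shared by both estimates below
            g_now = zo_grad(loss, phi, batch, mu, dirs)
            g_snap = zo_grad(loss, snapshot, batch, mu, dirs)
            v = g_now - g_snap + g_subset            # SVRG variance-reduced estimate
            phi = phi - lr * v                       # gradient descent update
    return phi

# Toy stand-in for the network: least squares, each row = [features..., target].
def lsq_loss(phi, data):
    X, y = data[:, :-1], data[:, -1]
    return 0.5 * np.mean((X @ phi - y) ** 2)

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
phi_true = np.arange(1.0, 6.0)
train = np.column_stack([X, X @ phi_true])
phi_init = np.zeros(5)
phi_fit = zo_svrg(lsq_loss, phi_init, train, outer_iters=40, inner_cycles=20,
                  subset_size=100, batch_size=10, lr=0.02, mu=1e-3, q=10)
```

The whole procedure touches the model only through `loss(phi, data)` queries, vector sums and scalar multiplications, which is the property that makes it a candidate for operations available on an optical device.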

It should be noted that S101 and S102 are training processes of the optical neural network model, and in an actual image recognition process, the above processes are only performed once, that is, performed when image recognition is performed for the first time, and after the execution is finished, the optical neural network model is stored in a preset storage area, so as to be directly called in a subsequent image recognition process.

S103: when an image recognition instruction is received, determining a target image according to the image recognition instruction;

S104: recognizing the target image by using the optical neural network model to obtain an image recognition result.

The above steps perform image recognition. Specifically, when image recognition is needed, an image recognition instruction can be sent to the optical device. The image recognition instruction may be initiated by a user through a corresponding terminal device, or triggered automatically in response to a preset condition, for example by acquiring images at regular intervals and recognizing them at regular intervals. After receiving the image recognition instruction, the optical device determines the target image, which is the image to be recognized, that is, an image of unknown type. The optical neural network model is then called from the preset storage area, and the target image is input to the optical neural network model for recognition, so that an image recognition result is obtained and the image type of the target image is determined.

As a preferred embodiment, the image recognition method may further include: and optimizing the optical neural network model according to the image recognition result to obtain the optimized optical neural network model.

The image recognition method of this preferred embodiment provides a model optimization function: the optical neural network model is optimized using the image recognition results to obtain an optimized optical neural network model, which effectively improves model accuracy and, in turn, the accuracy of image recognition results. The more image recognition results are collected, the higher the optimization precision of the optical neural network model and the higher the accuracy of subsequent image recognition results.

Based on the above embodiments, the present application provides another image recognition method.

In recent years, Optical Neural Networks (ONN) have proven to be an emerging neuromorphic platform featuring ultra-low latency, high bandwidth, and energy efficiency, where computationally intensive operations such as matrix multiplication can be efficiently implemented optically at the speed of light. Therefore, the image recognition method provided by the application can realize image recognition processing by constructing the optical neural network model.

One research focus for ONNs is implementing optical linear operations based on the MZI (Mach-Zehnder interferometer). Computation-intensive operations in neural networks, such as matrix multiplication, can be implemented with MZI topologies (such as GridNet and FFTNet), and the specific approach is to perform singular value decomposition on the weight matrix. Singular value decomposition is an important matrix decomposition, one of the algorithms commonly used in machine learning, and is widely applied in feature extraction, data reduction and recommendation systems. A real matrix of any dimension can be decomposed into the product of three matrices by singular value decomposition. Suppose W is an m × n matrix; then U is an m × m unitary matrix, Σ is an m × n diagonal matrix whose diagonal values are non-negative real numbers, and V is an n × n unitary matrix. With V^* denoting the conjugate transpose of V, the singular value decomposition of W is:

W = UΣV^*

The unitary matrices U and V in the above formula can be implemented with MZI topologies. In terms of the mathematical model, singular value decomposition thus allows matrix multiplication of any dimension to be implemented in the optical domain using structures such as programmable phase shifters and Mach-Zehnder interferometers.
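The decomposition can be checked numerically. The following is a minimal sketch using NumPy's `numpy.linalg.svd` on a small random real matrix; the matrix size and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))          # example real weight matrix, m = 4, n = 3

# full_matrices=True returns U (m x m), the singular values, and V^* (n x n).
U, s, Vh = np.linalg.svd(W, full_matrices=True)

Sigma = np.zeros(W.shape)                # m x n matrix, zero off the diagonal
Sigma[:len(s), :len(s)] = np.diag(s)     # non-negative singular values on the diagonal

W_rebuilt = U @ Sigma @ Vh               # W = U Sigma V^*
```

For a real W the factors are real, so "unitary" reduces to "orthogonal": `U.T @ U` and `Vh @ Vh.T` are identity matrices up to rounding.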

Further, a common approach to ONN training is to train the model on a pure software engine, train the model parameters using a back propagation algorithm, and then map the trained model to the optics for inference by matrix singular value decomposition and MZI topology parameterization. However, this approach suffers from the following drawbacks in terms of efficiency, performance and robustness:

(1) ONN training based purely on software is limited by the performance of digital computers; when the computational cost of matrix decomposition and parameterization is also considered, digital computers are inefficient at simulating ONN architectures;

(2) optical devices introduce a certain amount of noise because of manufacturing variations, control inaccuracies, thermal effects and the like, and ONN models trained purely in software suffer severe performance degradation and poor robustness because they lack accurate modeling of these non-idealities.

For noise introduced in practical applications, one scheme is to model and compensate for the noise, but existing compensation methods have high redundancy and are not suitable for large-scale MZI arrays. The other scheme is to compensate by self-reconstruction: model training is performed directly on the optical device, the original model parameters W are mapped to the parameters Φ_i of the MZI array, and training and inference are carried out directly on the noisy device, which can cancel part of the effect of the noise.

Therefore, considering the low latency, high bandwidth, and high energy efficiency of optical devices, ONN training can be performed directly on the optical device; that is, the gradient descent optimization process is implemented on the optical device, making full use of the ultrafast photonic chip to accelerate training. Since the parameterization process is differentiable, the chain rule gives:

∂L/∂φ = (∂L/∂U)·(∂U/∂φ);

where an n × n unitary matrix U is decomposed based on the triangular structure shown in Fig. 3 (Fig. 3 is a schematic diagram of the decomposition of a unitary matrix based on a triangular structure provided by the present application).

The matrix U can be decomposed into the product of a diagonal matrix D (dimension n × n) and a series of transformation matrices T_ij (dimension n × n), where φ_ij is the parameter corresponding to T_ij. In the transformation matrix T_ij, the entries at positions (i, i), (i+1, j), (i, j+1), (j, j) are given by sin φ, cos φ, and −sin φ; the remaining diagonal elements are 1, the remaining off-diagonal elements are 0, and φ is the model parameter.

In addition, ∂ is the partial derivative symbol and L is the loss function: by the chain rule, the derivative of L with respect to the model parameter φ equals the derivative of L with respect to the unitary matrix U multiplied by the derivative of U with respect to φ.
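The parameterization and the chain rule can be illustrated with a small real-valued sketch. Several points here are assumptions rather than details taken from the present application: each transformation matrix is modeled as a standard plane rotation (cos φ on the two diagonal positions, ∓sin φ on the two off-diagonal positions), the order of the (i, j) pairs is arbitrary, the loss is a toy squared error, and all function names are hypothetical.

```python
import numpy as np

def rotation(n, i, j, phi, derivative=False):
    """n x n plane rotation in coordinates (i, j), or its derivative w.r.t. phi."""
    c, s = np.cos(phi), np.sin(phi)
    if derivative:
        T = np.zeros((n, n))  # d/dphi of the constant entries is 0
        T[i, i], T[i, j], T[j, i], T[j, j] = -s, -c, c, -s
    else:
        T = np.eye(n)
        T[i, i], T[i, j], T[j, i], T[j, j] = c, -s, s, c
    return T

def build_unitary(n, phis, d, deriv_index=None):
    """U = D @ T_1 @ T_2 @ ...; if deriv_index=k, return dU/dphi_k instead."""
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    U = np.diag(np.asarray(d, dtype=float))
    for k, ((i, j), phi) in enumerate(zip(pairs, phis)):
        U = U @ rotation(n, i, j, phi, derivative=(k == deriv_index))
    return U

def loss(U, x, y):
    """Toy loss L = 0.5 * ||U x - y||^2 (an assumption for illustration)."""
    r = U @ x - y
    return 0.5 * float(r @ r)

def grad_phi(n, phis, d, x, y):
    """Chain rule: dL/dphi_k = sum over entries of (dL/dU) * (dU/dphi_k)."""
    U = build_unitary(n, phis, d)
    dL_dU = np.outer(U @ x - y, x)
    return np.array([np.sum(dL_dU * build_unitary(n, phis, d, k))
                     for k in range(len(phis))])

n = 3
phis = np.array([0.3, -0.7, 1.1])  # one phase per (i, j) pair
d = np.array([1.0, -1.0, 1.0])     # diagonal of D
x = np.array([1.0, 2.0, -1.0])
y = np.array([0.5, 0.0, 1.5])

U = build_unitary(n, phis, d)
print(np.allclose(U @ U.T, np.eye(n)))  # the product of D and the rotations is orthogonal
print(grad_phi(n, phis, d, x, y))       # gradient with respect to the phases
```

Computing this analytic gradient requires access to ∂L/∂U, which is the derivation work that a zeroth-order estimate measured on the device avoids.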

In an ONN, all programmable MZI phases are denoted Φ. ONN training continually updates the model parameters Φ according to the following gradient descent formula in order to minimize the loss function, where α is the learning rate and ∇_Φ L is the first derivative of the loss function with respect to the parameters:

Φ ← Φ − α∇_Φ L.

On this basis, the training process of the optical neural network model based on the ZO-SVRG algorithm may include the following steps:

1. Input the ONN loss function L(·), the initial MZI phases Φ^(0) (i.e., the initial model parameters), the image dataset X, the learning rate α, the total number of iterations t, the number of inner loops m, the batch size n, and the mini-batch size b.

2. Randomly select n training samples and labels {x_i, y_i} from dataset X, where x_i and y_i are the samples and labels selected in the i-th iteration (outer loop), and use these n training samples to estimate the derivative of the loss function L(·) with respect to the model parameters over the batch.

3. Record the current parameters Φ^(i) as the starting point of the inner loop. Randomly select b samples from x_i as a mini-batch, and calculate the gradient estimates on this mini-batch (in SVRG, one at the current inner-loop parameters and one at the recorded starting point).

4. Calculate the estimate of the first derivative of the loss function L(·) with respect to the parameters Φ^(i), and update the model parameters based on the gradient descent formula.

5. Repeat steps 3-4 (inner loop) for k = 0, 1, …, m−1, and take the parameters obtained after the m inner-loop updates as Φ^(i+1).

6. Repeat steps 2-5 (outer loop) for i = 0, 1, …, t−1; the resulting Φ^(t) are the final model parameters.
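The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the present application: a toy linear model with `mse_loss` stands in for the ONN and its loss; the averaged random gradient estimate uses one common form, (d/(μq)) Σ [L(Φ + μu) − L(Φ)]·u with q directions u drawn uniformly from the unit sphere; the smoothing radius `mu`, the direction count `q`, and all function names are hypothetical. Reusing the same directions for the two mini-batch estimates is a design choice of this sketch.

```python
import numpy as np

def unit_directions(rng, q, d):
    """q random directions on the unit sphere in R^d."""
    u = rng.standard_normal((q, d))
    return u / np.linalg.norm(u, axis=1, keepdims=True)

def avg_random_grad(loss, phi, dirs, mu):
    """Averaged random (zeroth-order) gradient estimate: uses loss values only."""
    base = loss(phi)
    diffs = np.array([loss(phi + mu * u) - base for u in dirs])
    return (phi.size / (mu * len(dirs))) * (diffs @ dirs)

def zo_svrg(sample_loss, X, Y, phi0, alpha, t, m, n, b, q=10, mu=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    phi = phi0.copy()
    d = phi.size
    for _ in range(t):                                   # outer loop (step 6)
        idx = rng.choice(len(X), size=n, replace=False)  # step 2: batch of n samples
        xb, yb = X[idx], Y[idx]
        g_batch = avg_random_grad(lambda p: sample_loss(p, xb, yb),
                                  phi, unit_directions(rng, q, d), mu)
        phi_start = phi.copy()                           # step 3: inner-loop start
        for _ in range(m):                               # inner loop (step 5)
            sub = rng.choice(n, size=b, replace=False)   # mini-batch of b samples
            f = lambda p: sample_loss(p, xb[sub], yb[sub])
            dirs = unit_directions(rng, q, d)
            # step 4: variance-reduced estimate, then gradient descent update
            v = (avg_random_grad(f, phi, dirs, mu)
                 - avg_random_grad(f, phi_start, dirs, mu) + g_batch)
            phi = phi - alpha * v
    return phi

def mse_loss(phi, x, y):
    """Toy stand-in for the ONN loss: mean squared error of a linear model."""
    r = x @ phi - y
    return 0.5 * float(np.mean(r ** 2))

data_rng = np.random.default_rng(1)  # arbitrary seed for the toy data
X = data_rng.standard_normal((200, 5))
Y = X @ data_rng.standard_normal(5)
phi0 = np.zeros(5)
phi_hat = zo_svrg(mse_loss, X, Y, phi0, alpha=0.05, t=30, m=10, n=32, b=8)
print(mse_loss(phi0, X, Y), mse_loss(phi_hat, X, Y))
```

On an optical device, each call to `loss` would correspond to a forward pass measured on the hardware, so no derivative of the device response is required.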

Therefore, the optical neural network model can be determined based on the final model parameters, and further, the optical neural network model can be applied to the field of image recognition, so that the image recognition can be realized more quickly and efficiently.

Therefore, according to the image recognition method provided by the embodiment of the application, firstly, a zeroth order optimization algorithm and a first order optimization algorithm are utilized to train an optical neural network model, wherein the zeroth order optimization algorithm can be integrated with the first order optimization algorithm, and the real gradient of the first order optimization algorithm is replaced by the zeroth order gradient, so that the model precision is effectively improved; furthermore, based on the characteristics of low delay, low power consumption, high bandwidth and high energy efficiency of the optical device, the training of the optical neural network model is directly carried out on the optical device, namely, the gradient descent optimization process is realized based on the optical device, so that a large amount of optical conversion or complex derivation operation on the optical device is avoided, and the rapid training of the all-optical neural network model is realized.

To solve the above technical problem, the present application further provides an image recognition apparatus. Referring to Fig. 4, which is a schematic structural diagram of the image recognition apparatus provided in the present application, the image recognition apparatus can be applied to an optical device and includes:

the data set acquisition module 1 is used for determining an image data set according to a model training instruction when the model training instruction is received;

the model training module 2 is used for performing model training on the image data set by utilizing a zeroth-order optimization algorithm and a first-order optimization algorithm to obtain an optical neural network model;

the image acquisition module 3 is used for determining a target image according to the image identification instruction when receiving the image identification instruction;

and the image recognition module 4 is used for recognizing the target image by using the optical neural network model to obtain an image recognition result.
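The division into four modules can be pictured with a short sketch. The class name, method names, and the use of plain callables for each module are assumptions made for illustration; the present application does not specify a software interface.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

@dataclass
class ImageRecognitionApparatus:
    # Module 1: training instruction -> image dataset
    load_dataset: Callable[[Any], Any]
    # Module 2: image dataset -> trained model (a callable from image to result)
    train: Callable[[Any], Callable[[Any], Any]]
    # Module 3: recognition instruction -> target image
    load_image: Callable[[Any], Any]
    # Trained optical neural network model, set by on_training_instruction
    model: Optional[Callable[[Any], Any]] = None

    def on_training_instruction(self, instruction: Any) -> None:
        dataset = self.load_dataset(instruction)
        self.model = self.train(dataset)

    def on_recognition_instruction(self, instruction: Any) -> Any:
        if self.model is None:
            raise RuntimeError("no trained model is available")
        image = self.load_image(instruction)
        return self.model(image)  # Module 4: recognize the target image

# Usage with stand-in modules (hypothetical values, for illustration only)
apparatus = ImageRecognitionApparatus(
    load_dataset=lambda instruction: [1, 2, 3],
    train=lambda dataset: (lambda image: image * len(dataset)),
    load_image=lambda instruction: 2,
)
apparatus.on_training_instruction("train")
result = apparatus.on_recognition_instruction("recognize")
```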

Therefore, the image recognition device provided by the embodiment of the application firstly utilizes the zeroth order optimization algorithm and the first order optimization algorithm to train the optical neural network model, wherein the zeroth order optimization algorithm can be integrated with the first order optimization algorithm, and the real gradient of the first order optimization algorithm is replaced by the zeroth order gradient, so that the model precision is effectively improved; furthermore, based on the characteristics of low delay, low power consumption, high bandwidth and high energy efficiency of the optical device, the training of the optical neural network model is directly carried out on the optical device, namely, the gradient descent optimization process is realized based on the optical device, so that a large amount of optical conversion or complex derivation operation on the optical device is avoided, and the rapid training of the all-optical neural network model is realized.

As a preferred embodiment, the model training module 2 may include:

the model training unit is used for carrying out model training on the training data set by utilizing a zeroth-order optimization algorithm and a first-order optimization algorithm to obtain an initial optical neural network model;

As a preferred embodiment, the zeroth-order optimization algorithm may specifically be an average random gradient estimation algorithm; the first-order optimization algorithm may be specifically an SVRG algorithm.

As a preferred embodiment, the model training unit may be specifically configured to randomly extract a training subset from a training data set; calculating a first derivative of the loss function with respect to the initial model parameters using the training subset; randomly extracting a preset number of training samples from the training subsets; calculating a preset number of training samples by using an average random gradient estimation algorithm to obtain a first gradient estimation value; calculating the first gradient estimation value and the first derivative by using an SVRG algorithm to obtain a first derivative estimation value; calculating the first derivative estimated value and the initial model parameter by using a gradient descent formula to obtain a first model parameter; taking the first model parameter as an initial model parameter, and returning to the step of randomly acquiring a preset number of training sample images from the training subset until obtaining the model parameter after the preset cycle number; taking the model parameter after the preset cycle number as an initial model parameter, and returning to the step of randomly acquiring the training subset from the training data set until obtaining the model parameter after the preset iteration number; and taking the model parameters after the preset iteration times as final model parameters, and determining an initial optical neural network model according to the final model parameters.

As a preferred embodiment, the image recognition apparatus may further include a model optimization module, configured to perform optimization processing on the optical neural network model according to the image recognition result, so as to obtain an optimized optical neural network model.

As a preferred embodiment, the image dataset may specifically be an MNIST dataset.

For the introduction of the apparatus provided in the present application, please refer to the above method embodiments, which are not described herein again.

To solve the above technical problem, the present application further provides an image recognition device. Referring to Fig. 5, which is a schematic structural diagram of the image recognition device provided in the present application, the image recognition device may include:

a memory 10 for storing a computer program;

a processor 20 for implementing the steps of any of the image recognition methods described above when executing the computer program.

For the introduction of the device provided in the present application, please refer to the above method embodiment, which is not described herein again.

To solve the above problem, the present application further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, can implement the steps of any one of the image recognition methods described above.

The computer-readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

For the introduction of the computer-readable storage medium provided in the present application, please refer to the above method embodiments, which are not described herein again.

The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.

Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The technical solutions provided by the present application are described in detail above. The principles and embodiments of the present application are explained herein using specific examples, which are provided only to help understand the method and the core idea of the present application. It should be noted that, for those skilled in the art, without departing from the principle of the present application, several improvements and modifications can be made to the present application, and these improvements and modifications also fall into the protection scope of the present application.

Claims

1. An image recognition method applied to an optical device, comprising:

2. The image recognition method of claim 1, wherein the image dataset comprises a training dataset and a testing dataset, and the model training of the image dataset by using a zeroth order optimization algorithm and a first order optimization algorithm to obtain the optical neural network model comprises:

3. The image recognition method according to claim 2, wherein the zeroth order optimization algorithm is specifically an average random gradient estimation algorithm; the first-order optimization algorithm is specifically an SVRG algorithm.

4. The image recognition method of claim 3, wherein the model training of the training data set using the zeroth order optimization algorithm and the first order optimization algorithm to obtain an initial optical neural network model comprises:

randomly extracting a training subset from the training data set;

5. The image recognition method according to claim 1, further comprising:

6. The image recognition method according to claim 1, wherein the image dataset is specifically an MNIST dataset.

7. An image recognition apparatus, applied to an optical device, comprising:

8. The image recognition apparatus of claim 7, wherein the model training module comprises:

9. An image recognition device, characterized by comprising:

a memory for storing a computer program;

a processor for implementing the steps of the image recognition method according to any one of claims 1 to 6 when executing the computer program.

10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the image recognition method according to any one of claims 1 to 6.