CN111899147B - Convolution kernel calculation accelerator and convolution kernel calculation method - Google Patents

Convolution kernel calculation accelerator and convolution kernel calculation method

Info

Publication number
CN111899147B
Authority
CN
China
Prior art keywords
optical signal
convolution kernel
matrix
time delay
modulator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010549461.XA
Other languages
Chinese (zh)
Other versions
CN111899147A (en
Inventor
王兴军
舒浩文
白博文
邹卫文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Shanghai Jiaotong University
Original Assignee
Peking University
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University, Shanghai Jiaotong University
Priority to CN202010549461.XA
Publication of CN111899147A
Application granted
Publication of CN111899147B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining

Abstract

Embodiments of the present invention provide a convolution kernel calculation accelerator and a convolution kernel calculation method that can obtain the operation result of a convolution kernel with less time and power consumption. The convolution kernel calculation accelerator includes an optical signal generator, a first modulator, a first time delay module, a second modulator, a second time delay module, a detector and a signal processing module, connected in sequence. The optical signal generator is used for emitting an optical signal; the first modulator is used for loading a first matrix of an image convolution kernel operation onto the optical signal, and the second modulator is used for loading a second matrix of the image convolution kernel operation onto the optical signal; the first time delay module is used for setting a first time delay amount of the optical signal, and the second time delay module is used for setting a second time delay amount of the optical signal; the detector is used for sampling the optical signal and outputting a target current signal at a target sampling time; and the signal processing module is used for converting the target current signal into a calculation result according to the association relationship between the target current signal and the calculation result.

Description

Convolution kernel calculation accelerator and convolution kernel calculation method
Technical Field
The invention relates to the field of photon calculation, in particular to a convolution kernel calculation accelerator and a convolution kernel calculation method.
Background
Convolution kernels are widely used in modern image processing. A convolution kernel is the basic operation unit of processing schemes such as image blurring, sharpening and edge detection. It is generally a small matrix that is convolved with a large matrix carrying the image information, re-weighting the information of each pixel in the image; this produces effects such as blurring, sharpening and edge detection and thereby extracts image features.
The convolution kernel operation of an image refers to the bit-wise (that is, element-by-element) multiplication and summation of two matrices of the same size. Assume that the two matrices participating in the convolution kernel operation are X and Y, as shown in the following formula:
[Formula image BDA0002541933080000011: matrices X and Y]
The operation result S of the convolution kernel can be expressed as:
[Formula image BDA0002541933080000012: operation result S]
Specifically, the operation result S of the convolution kernel may be expressed as the result of bit-wise multiplication and summation of the Y matrix with the X matrix rotated by 180° counterclockwise. Since the rotation of the matrix can be implemented during matrix encoding, the simplified operation result of the convolution kernel can be expressed as:
[Formula image BDA0002541933080000013: simplified operation result S]
However, obtaining the operation result of the convolution kernel through matrix operations requires a large number of binary electrical logic gates. Because the switching speed of the transistors used in these logic gates is limited, the matrix operation, and hence the convolution kernel operation, consumes considerable time and power.
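For reference, the multiply-and-sum operation described above can be written in a few lines of plain Python. This is only an illustrative sketch of the conventional, electronic computation: the function names `rotate_180` and `multiply_and_sum` and the sample values are hypothetical and do not come from the patent.

```python
def rotate_180(matrix):
    """Rotate a matrix (a list of rows) by 180 degrees."""
    return [row[::-1] for row in matrix[::-1]]


def multiply_and_sum(a, b):
    """Element-by-element ("bit-wise" in the text) multiplication, then summation."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same size")
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Illustrative values only; they are not taken from the patent.
X = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Y = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]

# Form with the explicit 180-degree rotation of X.
S_with_rotation = multiply_and_sum(rotate_180(X), Y)   # 285
# Simplified form, where the rotation is assumed to be done during encoding.
S_simplified = multiply_and_sum(X, Y)                  # 165
print(S_with_rotation, S_simplified)
```

Each output value costs one multiplication per matrix element plus the additions, which is the work the accelerator moves into the optical domain.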
Disclosure of Invention
The embodiments of the invention provide a convolution kernel calculation accelerator and a convolution kernel calculation method to solve the problem in the prior art that obtaining the operation result of a convolution kernel consumes considerable time and power.
An embodiment of the present invention provides a convolution kernel calculation accelerator, including:
an optical signal generator, a first modulator, a first time delay module, a second modulator, a second time delay module, a detector and a signal processing module, which are connected in sequence;
the optical signal generator is used for emitting an optical signal;
the first modulator is used for loading a first matrix of an image convolution kernel operation onto the optical signal, and the second modulator is used for loading a second matrix of the image convolution kernel operation onto the optical signal;
the first time delay module is configured to set a first time delay amount of the optical signal, and the second time delay module is configured to set a second time delay amount of the optical signal, where, under the first time delay amount and the second time delay amount, when the detector samples the optical signal, there is a target sampling time at which the first matrix and the second matrix can be bit-wise multiplied and summed;
the detector is used for sampling the optical signal and outputting a target current signal at the target sampling time, wherein the target current signal has an association relationship with the calculation result of the bit-wise multiplication and summation of the first matrix and the second matrix;
the signal processing module is used for converting the target current signal into the calculation result according to the association relationship.
Optionally, the first modulator is specifically configured to load a first matrix of the image convolution kernel operation onto a plurality of adjacent comb teeth of the optical signal in a light intensity form according to a preset loading interval, and the second modulator is specifically configured to load a second matrix of the image convolution kernel operation onto the plurality of adjacent comb teeth of the optical signal in the light intensity form according to the preset loading interval.
Optionally, the first delay module is specifically configured to set a first dispersion delay between a plurality of adjacent comb teeth of the optical signal, and the second delay module is specifically configured to set a second dispersion delay between a plurality of adjacent comb teeth of the optical signal.
Optionally, the detector is specifically configured to sample a plurality of adjacent comb teeth of the optical signal.
Optionally, the optical signal generator comprises an optical frequency comb generator for emitting an optical frequency comb.
Optionally, the first delay module is a first dispersion medium, and the second delay module is a second dispersion medium.
Optionally, an optical connection structure is adopted among the optical signal generator, the first modulator, the first dispersion medium, the second modulator, the second dispersion medium, and the detector, and an electrical connection structure is adopted between the detector and the signal processing module.
The embodiment of the invention provides a convolution kernel calculation method, which comprises the following steps:
loading a first matrix of the image convolution kernel operation on the first optical signal;
carrying out first time delay processing on the first optical signal according to a preset first time delay amount to obtain a second optical signal;
loading a second matrix of the image convolution kernel operation onto the second optical signal;
performing second time delay processing on the second optical signal according to a preset second time delay amount to obtain a third optical signal; wherein, under the first and second time delay amounts, when sampling the third optical signal, there is a target sampling time at which the first and second matrices can be bit-wise multiplied and summed;
sampling the third optical signal, and outputting a target current signal at the target sampling time, wherein the target current signal has an association relationship with the calculation result of the bit-wise multiplication and summation of the first matrix and the second matrix;
and converting the target current signal into the calculation result according to the association relationship.
An embodiment of the present invention provides an electronic device, including a memory, a processor, a controller, and a computer program stored in the memory and executable on the processor, where the controller is configured to control the above convolution kernel calculation accelerator, and the processor implements the above method when executing the program.
Embodiments of the present invention provide a non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor, implements the above-described method.
According to the convolution kernel calculation accelerator and the convolution kernel calculation method provided by the embodiments of the invention, after the optical signal has passed through each optoelectronic device, a current signal that has an association relationship with the operation result of the convolution kernel is obtained, and the operation result of the convolution kernel is then recovered from the current signal according to that relationship. Because optical signals propagate quickly and optoelectronic devices consume little power, the operation result of the convolution kernel can be obtained with less time and less power consumption.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic structural diagram of a convolution kernel calculation accelerator according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a convolution kernel calculation method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram illustrating a method for selecting dispersion delays of a first dispersion medium and a second dispersion medium according to an embodiment of the present invention;
Fig. 4 is a flowchart of a convolution kernel calculation method according to an embodiment of the present invention;
Fig. 5 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to Fig. 1, which is a schematic structural diagram of a convolution kernel calculation accelerator according to an embodiment of the present invention, the convolution kernel calculation accelerator 10 includes an optical signal generator 101, a first modulator 102, a first time delay module 103, a second modulator 104, a second time delay module 105, a detector 106 and a signal processing module 107, which are connected in sequence.
The optical signal generator 101 is configured to emit an optical signal.
Specifically, after the optical signal generator 101 emits an optical signal, the optical signal passes in sequence through the first modulator 102, the first time delay module 103, the second modulator 104 and the second time delay module 105. When the optical signal reaches the detector 106, the detector 106 samples it and outputs a current signal, and the signal processing module 107 processes the current signal to obtain a calculation result.
In practical applications, the optical signal generator 101 may be embodied as an optical frequency comb generator, which is used to emit an optical frequency comb. It should be understood that when the optical signal generator 101 is an optical frequency comb generator, the optical signal is an optical frequency comb.
An optical frequency comb, also known as an optical comb or a frequency comb, is a spectrum composed of a series of uniformly spaced frequency components that have a stable phase relationship with one another.
Optical frequency comb generators, also known as optical frequency comb light sources, include, but are not limited to, on-chip micro-ring resonators, mode-locked lasers, and cascaded modulators.
The first modulator 102 is used for loading a first matrix of the image convolution kernel operation on the optical signal, and the second modulator 104 is used for loading a second matrix of the image convolution kernel operation on the optical signal;
Specifically, image processing of an input image generally applies the convolution kernel technique. The basic operands of a convolution kernel operation on the input image are two matrices of the same size: one is the matrix of the specific convolution kernel unit, and the other is a matrix of the same size as the convolution kernel unit matrix, obtained by code conversion of the input image. The first modulator 102 loads the first matrix onto the optical signal, and the second modulator 104 loads the second matrix onto the optical signal. It should be understood that the first matrix and the second matrix are loaded at different transmission positions of the optical signal.
In practical applications, the first modulator 102 may load the first matrix onto a plurality of adjacent comb teeth of the optical signal in the form of light intensity at a preset loading interval, and the second modulator 104 may load the second matrix onto the plurality of adjacent comb teeth of the optical signal in the same way. Loading in the form of light intensity means that the matrix values of the first matrix are converted into first light intensity values and the matrix values of the second matrix are converted into second light intensity values. Loading at the preset loading interval means that the matrix values of each matrix are loaded onto the plurality of adjacent comb teeth one after another, separated in time by the preset loading interval.
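The loading step can be pictured with a small numerical model. The sketch below is an illustration, not the patented implementation: it assumes that the matrix values are non-negative and already scaled to usable intensity levels, that time is divided into slots one loading interval long, and that the modulator acts on all comb teeth identically. The function name `modulate` and the sample values are hypothetical.

```python
def modulate(comb, values):
    """Scale every comb tooth in time slot k by values[k % len(values)].

    comb[j][k] is the intensity of comb tooth j in time slot k; one slot
    lasts one loading interval. The values are assumed to repeat periodically.
    """
    n = len(values)
    return [[intensity * values[k % n] for k, intensity in enumerate(tooth)]
            for tooth in comb]


# Nine comb teeth of equal, unit intensity observed over nine time slots.
unmodulated = [[1] * 9 for _ in range(9)]
# Hypothetical matrix values X1..X9 of a 3 x 3 matrix, flattened row by row.
X = [3, 1, 4, 1, 5, 9, 2, 6, 5]

loaded = modulate(unmodulated, X)
# In any one slot all teeth carry the same value; over time each tooth
# steps through X1..X9.
print(loaded[0])
print(loaded[8])
```

In this model the matrix is serialized in time and copied onto every comb tooth; separating the values across teeth is left to the delay modules.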
The first delay module 103 is configured to set a first delay amount of the optical signal, and the second delay module 105 is configured to set a second delay amount of the optical signal, where, under the first delay amount and the second delay amount, when the detector 106 samples the optical signal, there is a target sampling time at which the first matrix and the second matrix can be bit-wise multiplied and summed. Further, the first delay module 103 is specifically configured to set a first dispersion delay between a plurality of adjacent comb teeth of the optical signal, and the second delay module 105 is specifically configured to set a second dispersion delay between a plurality of adjacent comb teeth of the optical signal.
Specifically, optical signals of different wavelengths take different times to pass through the first delay module 103, and likewise through the second delay module 105. That is, the first delay module 103 and the second delay module 105 introduce time delays between the comb teeth of the optical signal. The first delay amount set by the first delay module 103 and the second delay amount set by the second delay module 105 need to satisfy the following condition: when the detector 106 samples the optical signal, there is a target sampling time at which the first matrix and the second matrix can be bit-wise multiplied and summed. That is, at the target sampling time, the sampled signal exhibits the bit-wise products of the first matrix and the second matrix.
In general, the first delay amount may be a first integer multiple of the preset loading interval, and the second delay amount may be a second integer multiple of the preset loading interval. It should be further understood that the choice of the first and second delay amounts is not unique: any choice is acceptable as long as, when the detector 106 samples the optical signal, there is a target sampling time at which the first and second matrices can be bit-wise multiplied and summed.
The first delay module 103 may be a first dispersion medium and the second delay module 105 may be a second dispersion medium. A medium is dispersive if its dielectric constant, or the propagation speed of a wave in the medium, depends on frequency. The first and second dispersion media include, but are not limited to, dispersive optical fibers and photonic crystal waveguides.
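The effect of a delay module on the comb teeth can be sketched as a per-tooth time shift. The following is a simplified model under stated assumptions: the delay grows linearly with the tooth number, it is an integer number of loading intervals per tooth, and the loaded signal repeats periodically. The function name `disperse` and the three-tooth example are hypothetical.

```python
def disperse(signal, step):
    """Delay comb tooth j by j * step time slots (periodic signal assumed).

    A negative step advances higher-numbered teeth instead of delaying them.
    """
    num_slots = len(signal[0])
    return [[tooth[(k - j * step) % num_slots] for k in range(num_slots)]
            for j, tooth in enumerate(signal)]


# Three teeth that all carry the same periodic sequence a, b, c.
signal = [["a", "b", "c"] for _ in range(3)]
delayed = disperse(signal, 1)

# Values seen across the three teeth when sampling slot 0 after the delay.
snapshot = [tooth[0] for tooth in delayed]
print(delayed)
print(snapshot)
```

Before the delay, a single sampling slot shows the same value on every tooth; after it, one slot shows different values on different teeth, which is what lets a single sampling time expose a whole matrix.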
The detector 106 is configured to sample the optical signal and output a target current signal at a target sampling time, where the target current signal has an association relationship with a calculation result of bit-wise multiplication and summation performed on the first matrix and the second matrix;
Specifically, when the optical signal reaches the detector 106, the detector 106 samples the plurality of adjacent comb teeth of the optical signal. At the target sampling time, the target current signal sampled by the detector 106 has an association relationship with the calculation result of the bit-wise multiplication and summation of the first matrix and the second matrix. That is, at the target sampling time, the sampled signal exhibits the bit-wise products of the first matrix and the second matrix, and the target current signal output by the detector 106 has a magnitude that is linearly related to the result of the bit-wise multiplication and summation of the first matrix and the second matrix.
The signal processing module 107 is configured to convert the target current signal into a calculation result according to the association relationship.
Specifically, after the detector 106 outputs the target current signal, the signal processing module 107 may perform signal compensation on the target current signal and then output the calculation result according to the mapping defined by the association relationship. It should be understood that the calculation result is the result of bit-wise multiplying and summing the first matrix and the second matrix.
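Because the text describes the current as linearly related to the multiply-and-sum result, the conversion can be sketched as inverting a linear map. The model below is an assumption made for illustration: an ideal detector with a constant responsivity and a constant offset (standing in for whatever the signal compensation removes). The names `detect`, `current_to_result`, `R` and `I_DARK` and all numbers are hypothetical.

```python
def detect(tooth_intensities, responsivity, dark_current=0.0):
    """Idealized detector: current is linear in the total sampled intensity."""
    return responsivity * sum(tooth_intensities) + dark_current


def current_to_result(current, responsivity, dark_current=0.0):
    """Invert the assumed linear relation to recover the multiply-and-sum result."""
    return (current - dark_current) / responsivity


# Hypothetical numbers, for illustration only.
products = [2.0, 0.5, 1.5]   # element-wise products carried by the comb teeth
R, I_DARK = 0.5, 0.25        # assumed responsivity and offset

current = detect(products, R, I_DARK)
result = current_to_result(current, R, I_DARK)
print(current, result)
```

In a real system the gain and offset would have to be calibrated; the patent text does not specify how the compensation is performed.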
An optical connection structure is adopted among the optical signal generator 101, the first modulator 102, the first time delay module 103, the second modulator 104, the second time delay module 105 and the detector 106 in the convolution kernel calculation accelerator 10, and an electrical connection structure is adopted between the detector 106 and the signal processing module 107.
According to the convolution kernel calculation accelerator 10 provided by the embodiment of the present invention, after the optical signal has passed through each optoelectronic device, a current signal that has an association relationship with the operation result of the convolution kernel is obtained, and the operation result of the convolution kernel is then recovered from the current signal according to that relationship. Because optical signals propagate quickly and optoelectronic devices consume little power, the operation result of the convolution kernel can be obtained with less time and less power consumption.
Referring to Fig. 2, which is a schematic diagram of a convolution kernel calculation method according to an embodiment of the present invention, take the convolution kernel operation of two 3 × 3 matrices as an example. X and Y are respectively expressed as:
$$X = \begin{bmatrix} X_1 & X_2 & X_3 \\ X_4 & X_5 & X_6 \\ X_7 & X_8 & X_9 \end{bmatrix}, \qquad Y = \begin{bmatrix} Y_1 & Y_2 & Y_3 \\ Y_4 & Y_5 & Y_6 \\ Y_7 & Y_8 & Y_9 \end{bmatrix}$$
The convolution kernel operation result can be expressed as:
$$S = X \cdot Y = X_1 Y_1 + X_2 Y_2 + \dots + X_9 Y_9$$
In operation, the convolution kernel calculation accelerator performs five main steps: optical comb (optical frequency comb) generation; X matrix loading and dispersion delay; instantaneous sampling; Y matrix loading and dispersion delay; and instantaneous sampling. Specifically, the number of working comb teeth of the optical frequency comb is determined by the matrix size; in this embodiment there are 9 working comb teeth, with wavelengths λ1 to λ9. The first modulator then loads the matrix X onto the 9 comb teeth in the form of light intensity, with a signal loading interval Δt between successive values X1 to X9. At this point, the intensities of all comb teeth sampled at the same time are identical. When the optical frequency comb modulated by the first modulator passes through the first dispersion medium, comb teeth of different wavelengths take different times to pass through it because of dispersion. When the dispersion delay t1 between adjacent comb teeth is an integer multiple of Δt, the signals on the comb teeth sampled by the detector at the same time are shifted relative to one another, and when t1 takes a suitable value, the comb teeth sampled by the detector at the same time present the information of the whole input matrix; that is, the comb tooth intensities at λ1 to λ9 contain all the values of the X matrix.
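Which delays count as "suitable" can be explored with simple index arithmetic. The sketch below is a toy model, not taken from the patent: it assumes that tooth j is delayed by j times the per-tooth delay, that the delay is a whole number of loading intervals, and that the nine values repeat periodically. Under these assumptions a single sampling time shows every matrix value exactly once when the per-tooth delay and the number of values share no common factor. The function names are hypothetical.

```python
from math import gcd


def teeth_snapshot(num_values, step, slot):
    """0-based indices of the matrix values seen on teeth 0..num_values-1.

    Tooth j is assumed to be delayed by j * step loading intervals, so at
    the given slot it shows value number (slot - j * step) mod num_values.
    """
    return [(slot - j * step) % num_values for j in range(num_values)]


def shows_whole_matrix(num_values, step, slot=0):
    """True if one sampling slot shows each matrix value exactly once."""
    snapshot = teeth_snapshot(num_values, step, slot)
    return sorted(snapshot) == list(range(num_values))


for step in (0, 1, -2, 3):
    # Second column: direct check; third column: the coprimality shortcut.
    print(step, shows_whole_matrix(9, step), gcd(step, 9) == 1)
```

For nine values, per-tooth delays of 1 or -2 loading intervals spread the whole matrix across the teeth, while delays of 0 or 3 intervals do not.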
The optical frequency comb then passes through the second modulator and the second dispersion medium, which work on the same principle as the first modulator and the first dispersion medium. By loading the matrix Y and selecting a suitable dispersion delay t2, the comb teeth sampled at a certain time at the input of the detector present the bit-wise products of the X and Y matrices: the comb tooth intensities at λ1 to λ9 contain all the elements of the bit-wise multiplication result, namely X1·Y1 to X9·Y9. The current output by the detector is then directly and linearly related to the convolution kernel operation result X·Y, so the calculation of the convolution kernel is realized.
Referring to Fig. 3, which is a schematic diagram of a method for selecting the dispersion delays of the first dispersion medium and the second dispersion medium according to an embodiment of the present invention, Fig. 3 shows the comb tooth intensities obtained at the different wavelength positions at each sampling time. The comb tooth intensities correspond to matrix values, and the sampling interval equals the signal loading interval. After the X matrix is loaded, the comb tooth intensities at λ1 to λ9 are identical at any one sampling time and vary periodically over time in the order X1 to X9. When the dispersion delay of the first dispersion medium is set to t1 = -2Δt, the comb tooth intensities at λ1 to λ9 at a single sampling time after the first dispersion medium together present all the information of the X matrix (the per-tooth values are shown in Fig. 3). When the dispersion delay of the second dispersion medium is set to t2 = Δt, the comb tooth intensities at λ1 to λ9 at a certain sampling time after the second dispersion medium present the bit-wise products of the two matrices X and Y (for example, X4·Y4). That is, when the dispersion delays t1 and t2 are -2Δt and Δt respectively, at sampling time t5 the convolution kernel calculation accelerator completes the bit-wise multiplication of the two matrices, the collected convolution result is X1Y1 + X2Y2 + X3Y3 + ... + X9Y9, and the output signal of the detector corresponds to the expected convolution kernel operation result.
In practice, positive and negative dispersion delays can be realized with dielectric materials that have anomalous dispersion and normal dispersion, respectively. The choice of dispersion delays t1 and t2 used in this scheme is not the only solution; any choice of dispersion delays that yields the expected output is a reasonable scheme.
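One way to see that the delay choice is not unique is to enumerate delay pairs in a toy index model. The sketch below rests on assumptions that the text does not state: tooth j is delayed by j times the per-tooth delay in each medium, both loading patterns repeat with period nine, X is loaded in the order X1 to X9, and Y is loaded in the reverse order Y9 to Y1 (one possible way of folding the 180° rotation into the encoding). The function name `aligned_slots` and the search range are hypothetical.

```python
def aligned_slots(n, d1, d2):
    """Time slots at which every tooth carries a product X_i * Y_i with
    matching indices and all n indices appear exactly once.

    Model assumptions (not taken from the patent text):
      * tooth j is delayed by j*d1 slots in the first medium and by j*d2
        slots in the second medium;
      * X is loaded in the order X1..Xn, Y in the reverse order Yn..Y1;
      * both loading patterns repeat with period n.
    """
    slots = []
    for t in range(n):
        # X passes through both media, Y only through the second one.
        x_idx = [(t - j * (d1 + d2)) % n for j in range(n)]
        y_idx = [(n - 1 - (t - j * d2)) % n for j in range(n)]
        if x_idx == y_idx and sorted(x_idx) == list(range(n)):
            slots.append(t)
    return slots


# Search small per-tooth delays, measured in loading intervals.
solutions = {}
for d1 in range(-4, 5):
    for d2 in range(-4, 5):
        found = aligned_slots(9, d1, d2)
        if found:
            solutions[(d1, d2)] = found

for pair, found in sorted(solutions.items()):
    print(pair, found)
```

Under these assumptions the pair (-2, 1) aligns all nine products in the fifth time slot (index 4 when counting from 0), and several other pairs in the search range do so as well. A different loading order or sign convention would shift which pairs and slots qualify.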
Referring to fig. 4, an embodiment of the present invention discloses a convolution kernel calculation method, including:
401. Loading a first matrix of an image convolution kernel operation onto a first optical signal;
In the process of processing an input image, the convolution kernel technique is usually applied, and the basic operands of the convolution kernel operation of the input image are two matrices of the same size. That is, the first matrix and the second matrix of the image convolution kernel operation can be two matrices of the same size obtained by matrix conversion of pixels of the input image.
The convolution kernel computation accelerator loads the first matrix to the first optical signal, i.e., loads the first matrix from the electrical domain to the optical domain.
402. Carrying out first time delay processing on the first optical signal according to a preset first time delay amount to obtain a second optical signal;
At a certain transmission position of the first optical signal, the convolution kernel calculation accelerator can delay the first optical signal by means of a first time delay module, which provides the first time delay amount.
Generally, optical signals of different wavelengths take different times to pass through the first time delay module.
403. Loading a second matrix of image convolution kernel operations onto a second optical signal;
The convolution kernel calculation accelerator loads the second matrix onto the second optical signal, i.e., loads the second matrix from the electrical domain into the optical domain.
It should be understood that the first matrix and the second matrix are loaded at different transmission positions of the optical signal.
404. Carrying out second time delay processing on the second optical signal according to a preset second time delay amount to obtain a third optical signal; under the first time delay amount and the second time delay amount, when the third optical signal is sampled, a target sampling moment exists at which the first matrix and the second matrix can be multiplied and summed according to bits;
At a certain transmission position of the second optical signal, the convolution kernel calculation accelerator can delay the second optical signal by means of a second time delay module, which provides the second time delay amount.
The first delay amount and the second delay amount need to satisfy the following conditions: when the convolution kernel computation accelerator samples the third optical signal, there is a target sampling instant at which the first matrix and the second matrix can be bit-wise multiplied and summed. That is, at a target sampling instant, the sampled signal can exhibit the property of bit-wise multiplication of the first matrix and the second matrix.
405. Sampling the third optical signal, and outputting a target current signal at the target sampling time, wherein the target current signal has an association relationship with the calculation result of the bit-wise multiplication and summation of the first matrix and the second matrix;
At the target sampling time, the target current signal sampled by the convolution kernel calculation accelerator has an association relationship with the calculation result of the bit-wise multiplication and summation of the first matrix and the second matrix. That is, at the target sampling time, the sampled signal exhibits the bit-wise products of the first matrix and the second matrix, and the target current signal output by the convolution kernel calculation accelerator has a magnitude that is linearly related to the calculation result of the bit-wise multiplication and summation of the first matrix and the second matrix.
406. Converting the target current signal into the calculation result according to the association relationship.
The convolution kernel calculation accelerator can perform signal compensation on the target current signal and then output the calculation result according to the mapping defined by the association relationship. It should be understood that the calculation result is the result of bit-wise multiplying and summing the first matrix and the second matrix.
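Steps 401 to 406 can be strung together in a small numerical simulation. This is a sketch under explicit assumptions rather than the patented implementation: intensities are idealized numbers, time is divided into slots one loading interval long, the loaded signals repeat periodically, tooth j is delayed by j times the per-tooth delay, the second matrix is loaded in reverse order, and the detector is perfectly linear. All function names, default parameters and sample values are hypothetical, and the defaults are only meaningful for nine values.

```python
def modulate(signal, values):
    """Scale every tooth in time slot k by values[k % len(values)]."""
    n = len(values)
    return [[v * values[k % n] for k, v in enumerate(tooth)] for tooth in signal]


def disperse(signal, step):
    """Delay tooth j by j * step time slots (periodic signal assumed)."""
    slots = len(signal[0])
    return [[tooth[(k - j * step) % slots] for k in range(slots)]
            for j, tooth in enumerate(signal)]


def convolution_kernel_result(X, Y, d1=-2, d2=1, target_slot=4, responsivity=1.0):
    """Follow steps 401-406 for flattened matrices X and Y of equal length."""
    n = len(X)
    comb = [[1] * n for _ in range(n)]       # unmodulated comb, n teeth
    first = modulate(comb, X)                # 401: load the first matrix
    second = disperse(first, d1)             # 402: first delay
    second = modulate(second, Y[::-1])       # 403: load the second matrix
                                             #      (reverse order is an assumption)
    third = disperse(second, d2)             # 404: second delay
    # 405: sample all teeth at the target slot; the current is linear in the sum.
    current = responsivity * sum(tooth[target_slot] for tooth in third)
    return current / responsivity            # 406: map the current to the result


# Hypothetical flattened 3 x 3 matrices, for illustration only.
X = [1, 2, 3, 4, 5, 6, 7, 8, 9]
Y = [2, 0, 1, 3, 1, 0, 2, 1, 1]

result = convolution_kernel_result(X, Y)
expected = sum(x * y for x, y in zip(X, Y))
print(result, expected)
```

With these assumptions the value recovered at the fifth slot equals the element-wise multiply-and-sum of X and Y, while sampling at another slot (for example `target_slot=0`) gives a different number, which is why the target sampling time matters.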
For a detailed description of the convolution kernel calculation method of this embodiment of the present invention, refer to the description of the convolution kernel calculation accelerator above; details are not repeated here.
According to the convolution kernel calculation method provided by the embodiment of the present invention, after the optical signal has passed through each optoelectronic device, a current signal having an association relation with the result of the convolution kernel operation is obtained, so that the result of the convolution kernel operation corresponding to the current signal can be derived from that association relation. Because optical signals propagate quickly and optoelectronic devices consume little power, the result of the convolution kernel operation can be obtained in less time and with less power consumption.
Fig. 5 is a schematic diagram of the physical structure of an electronic device. As shown in Fig. 5, the electronic device may include a processor 501, a communications interface 502, a memory 503, a controller 504, and a communication bus 505, where the processor 501, the communications interface 502, the memory 503, and the controller 504 communicate with one another via the communication bus 505. The controller 504 is used to control the convolution kernel calculation accelerator, and the processor 501 may call logic instructions in the memory 503 to perform the following method: loading a first matrix of an image convolution kernel operation onto a first optical signal; performing first time delay processing on the first optical signal according to a preset first time delay amount to obtain a second optical signal; loading a second matrix of the image convolution kernel operation onto the second optical signal; performing second time delay processing on the second optical signal according to a preset second time delay amount to obtain a third optical signal, wherein, under the first and second time delay amounts, when the third optical signal is sampled, there is a target sampling time at which the first and second matrices can be multiplied element-wise and summed; sampling the third optical signal, and outputting a target current signal at the target sampling time, wherein the target current signal has an association relation with the calculation result of element-wise multiplication and summation of the first matrix and the second matrix; and converting the target current signal into the calculation result according to the association relation. In addition, the logic instructions in the memory 503 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium.
Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the method provided by the foregoing embodiments, for example including: loading a first matrix of an image convolution kernel operation onto a first optical signal; performing first time delay processing on the first optical signal according to a preset first time delay amount to obtain a second optical signal; loading a second matrix of the image convolution kernel operation onto the second optical signal; performing second time delay processing on the second optical signal according to a preset second time delay amount to obtain a third optical signal, wherein, under the first and second time delay amounts, when the third optical signal is sampled, there is a target sampling time at which the first and second matrices can be multiplied element-wise and summed; sampling the third optical signal, and outputting a target current signal at the target sampling time, wherein the target current signal has an association relation with the calculation result of element-wise multiplication and summation of the first matrix and the second matrix; and converting the target current signal into the calculation result according to the association relation.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented by hardware or a dedicated ASIC chip. Based on this understanding, the above technical solutions, in essence or in the part that contributes beyond the prior art, may be embodied in the form of a software product and a hardware architecture, where the computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method described in the embodiments or in some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A convolution kernel computation accelerator, comprising: an optical signal generator, a first modulator, a first time delay module, a second modulator, a second time delay module, a detector, and a signal processing module, connected in sequence;
the optical signal generator is used for emitting optical signals;
the first modulator is used for loading a first matrix of image convolution kernel operation on the optical signal, and the second modulator is used for loading a second matrix of the image convolution kernel operation on the optical signal;
the first time delay module is configured to set a first time delay amount of the optical signal, and the second time delay module is configured to set a second time delay amount of the optical signal, where under the first time delay amount and the second time delay amount, when the detector samples the optical signal, there is a target sampling time at which the first matrix and the second matrix can be multiplied element-wise and summed;
the detector is used for sampling the optical signal and outputting a target current signal at the target sampling time, wherein the target current signal has an association relation with the calculation result of element-wise multiplication and summation of the first matrix and the second matrix;
the signal processing module is used for converting the target current signal into the calculation result according to the association relation.
2. The convolution kernel computation accelerator of claim 1, wherein the first modulator is configured to load a first matrix of the image convolution kernel onto a plurality of adjacent comb teeth of the optical signal in a light intensity format at a preset loading interval, and the second modulator is configured to load a second matrix of the image convolution kernel onto the plurality of adjacent comb teeth of the optical signal in the light intensity format at the preset loading interval.
3. The convolution kernel computation accelerator of claim 1, wherein the first delay module is configured to set a first dispersion delay between a plurality of adjacent comb teeth of the optical signal, and the second delay module is configured to set a second dispersion delay between a plurality of adjacent comb teeth of the optical signal.
4. The convolution kernel computation accelerator of claim 1, wherein the detector is specifically configured to sample a plurality of adjacent comb teeth of the optical signal.
5. The convolution kernel computation accelerator of claim 1, wherein the optical signal generator comprises an optical frequency comb generator, the optical frequency comb generator configured to emit an optical frequency comb.
6. The convolution kernel computation accelerator of claim 1, wherein the first delay module is a first dispersion medium and the second delay module is a second dispersion medium.
7. The convolution kernel computation accelerator of claim 6, wherein an optical connection structure is used between the optical signal generator, the first modulator, the first dispersion medium, the second modulator, the second dispersion medium, and the detector, and an electrical connection structure is used between the detector and the signal processing module.
8. A method of convolution kernel computation, the method comprising:
loading a first matrix of image convolution kernel operations onto a first optical signal;
carrying out first time delay processing on the first optical signal according to a preset first time delay amount to obtain a second optical signal;
loading a second matrix of the image convolution kernel operation onto the second optical signal;
performing second time delay processing on the second optical signal according to a preset second time delay amount to obtain a third optical signal; wherein, under the first and second time delay amounts, when the third optical signal is sampled, there is a target sampling time at which the first and second matrices can be multiplied element-wise and summed;
sampling the third optical signal, and outputting a target current signal at the target sampling time, wherein the target current signal has an association relation with the calculation result of element-wise multiplication and summation of the first matrix and the second matrix;
and converting the target current signal into the calculation result according to the association relation.
9. An electronic device comprising a memory, a processor, a controller and a computer program stored on the memory and executable on the processor, wherein the controller is configured to control a convolution kernel calculation accelerator according to any one of claims 1 to 7, and the processor implements the steps of the convolution kernel calculation method according to claim 8 when executing the program.
10. A non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the convolution kernel calculation method according to claim 8.

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN202010549461.XA | CN111899147B (en) | 2020-06-16 | 2020-06-16 | Convolution kernel calculation accelerator and convolution kernel calculation method

Publications (2)

Publication Number | Publication Date
CN111899147A (en) | 2020-11-06
CN111899147B (en) | 2022-08-09

Family

ID=73207676

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN202010549461.XA | Active | CN111899147B (en) | 2020-06-16 | 2020-06-16 | Convolution kernel calculation accelerator and convolution kernel calculation method

Country Status (1)

Country Link
CN (1) CN111899147B (en)

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant