WO2020135093A1 - Convolutional neural network processing method, apparatus, device, and storage medium - Google Patents

Convolutional neural network processing method, apparatus, device, and storage medium

Info

Publication number
WO2020135093A1
Authority
WO
WIPO (PCT)
Prior art keywords
matrix
input neuron
original
transformation
target
Prior art date
Application number
PCT/CN2019/125047
Other languages
English (en)
French (fr)
Inventor
易松松
熊祎
Original Assignee
广州市百果园信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 广州市百果园信息技术有限公司
Publication of WO2020135093A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Description

  • Embodiments of the present application relate to deep learning technology, for example, to a convolutional neural network processing method, device, equipment, and storage medium.
  • Embodiments of the present application provide a convolutional neural network processing method, device, equipment, and storage medium to improve the calculation speed of the convolutional neural network.
  • An embodiment of the present application provides a convolutional neural network processing method.
  • the method includes: obtaining the original weight matrix and the original input neuron matrix of the convolutional neural network; sequentially performing a Winograd transform and quantization on the original weight matrix to obtain a target weight matrix; and sequentially performing quantization and a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix;
  • the output neuron matrix of the convolutional neural network is then obtained according to the target weight matrix and the target input neuron matrix.
  • An embodiment of the present application also provides a convolutional neural network processing device, which includes:
  • the original matrix acquisition module is configured to acquire the original weight matrix and the original input neuron matrix of the convolutional neural network;
  • the target matrix generation module is configured to sequentially perform a Winograd transform and quantization on the original weight matrix to obtain a target weight matrix, and to sequentially perform quantization and a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix;
  • the output neuron matrix generation module is configured to obtain the output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
  • An embodiment of the present application also provides a device, which includes:
  • one or more processors; and
  • a memory configured to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the method provided by the embodiments of the present application.
  • An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the method as provided in the embodiment of the present application is implemented.
  • FIG. 1 is a flowchart of a convolutional neural network processing method provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of another convolutional neural network processing method provided by an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a convolutional neural network processing device provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a device in an embodiment of the present application.
  • Convolutional neural networks are generally composed of three parts: the first part is the input layer; the second part is composed of convolutional layers, activation layers, and pooling layers (or downsampling layers); and the third part is a fully connected multilayer perceptron classifier (that is, the fully connected layers).
  • the convolutional layer is responsible for feature extraction and uses two key concepts: receptive fields and weight sharing. The pooling layer performs local averaging and subsampling to reduce the sensitivity of features to shift and distortion, because the precise location of a feature matters less than its position relative to other features. The fully connected layer performs classification.
  • the convolutional layer is the core of the convolutional neural network.
  • the convolutional layer is composed of two-dimensional planes of neurons called feature maps.
  • Each neuron on a feature map is connected to a small subset of neurons in the previous layer (called its receptive field). The local receptive field in two-dimensional space allows the convolutional neural network to extract primary features from the input.
  • Neurons in a feature map share one or more weight matrices (convolution kernels). Weight sharing greatly reduces the number of weights that need to be trained and the amount of training data required, reduces the complexity of the neural network, and alleviates overfitting.
  • the above feature extraction process is the convolution operation process.
  • the feature map can be understood as a neuron matrix.
  • the feature map of the current layer before feature extraction can be used as the input neuron matrix;
  • after the current layer undergoes feature extraction (that is, the convolution operation), the resulting feature map is used as the output neuron matrix.
  • the output neuron matrix of the current layer is the input neuron matrix of the next layer.
  • the convolution operation of the convolution layer occupies most of the calculation of the convolutional neural network.
  • the calculation amount refers to the number of operations of the network, and the number of operations of the network can be expressed by time complexity.
  • the number of multiplication operations required to obtain each output neuron matrix is W_out*H_out*F*F;
  • the multiplication operations in the convolution operation therefore account for a major part of the time complexity of the convolutional neural network.
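The multiplication count above follows directly from the dimensions. A minimal sketch (the 112x112 output size is only an illustrative choice, not from the text):

```python
def direct_conv_mults(w_out, h_out, f):
    """Multiplications needed for one output neuron matrix with a direct
    convolution: each of the W_out*H_out output neurons needs F*F
    multiplications with an F*F kernel."""
    return w_out * h_out * f * f

# e.g. a 3x3 kernel producing a 112x112 output map:
print(direct_conv_mults(112, 112, 3))  # 112896
```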
  • the premise of using non-time-consuming arithmetic operations instead of time-consuming arithmetic operations to reduce the complexity of convolutional neural networks is that the clock cycle of time-consuming operations is greater than that of non-time-consuming operations.
  • the above implementation method improves the calculation speed of the convolutional neural network by reducing the result accuracy of the convolutional neural network. If the error of the result obtained by the above processing method is within a preset error range, then the above processing method may be considered feasible. If the error of the result obtained by the above processing method is not within the preset range, then even if the above processing method is used to increase the calculation speed of the convolutional neural network, the above processing method will be considered infeasible.
  • the premise of whether the above processing method is feasible is that the error of the result obtained by the above processing method is within a preset error range. Accordingly, in order to ensure that the above processing method is feasible, it is necessary to consider how to reduce the error of the result, that is, to improve the accuracy of the result. The above will be described below in conjunction with specific embodiments.
  • FIG. 1 is a flowchart of a convolutional neural network processing method provided by an embodiment of the present application. This embodiment can be applied to a case where the calculation speed of a convolutional neural network is improved.
  • the method can be executed by a convolutional neural network processing device;
  • the device may be implemented in the form of software and/or hardware.
  • the device may be configured in the device, for example, in a computer or mobile terminal. As shown in Figure 1, the method includes the following steps:
  • Step 110 Obtain the original weight matrix and original input neuron matrix of the convolutional neural network.
  • Step 120 Perform a Winograd transform and quantization process on the original weight matrix in sequence to obtain a target weight matrix.
  • Step 130 Perform quantization processing and Winograd transform on the original input neuron matrix in sequence to obtain the target input neuron matrix.
  • the Winograd transform may be used to reduce the complexity of the convolutional neural network. The Winograd transformation will be described below.
  • the Winograd transform is a widely used fast algorithm for convolution via matrix multiplication.
  • the idea of the Winograd transform is to add a small number of addition operations in exchange for a reduction in the number of multiplication operations.
  • the output neuron transformation matrix A, the original weight transformation matrix G, and the original input neuron transformation matrix B^T may be determined according to preset Winograd transformation parameters. The preset Winograd transformation parameters here refer to the dimension of the output neuron matrix M_out and the dimension of the original weight matrix N: the output neuron matrix M_out is a W_out*H_out matrix and the original weight matrix N is an F*F matrix; that is, the original weight transformation matrix G and the original input neuron transformation matrix B^T can be determined according to F(W_out*H_out, F*F).
  • the number of multiplications μ(F(W_out*H_out, F*F)) required to complete the convolution operation based on the Winograd transform is (W_out+F-1)(H_out+F-1); that is, after performing the Winograd transform on the original input neuron matrix and the original weight matrix N, the output can be obtained from the transformed matrices with (W_out+F-1)(H_out+F-1) multiplications.
  • the Winograd transform can reduce the number of multiplication operations of the convolutional neural network, that is, the Winograd transform can reduce the time complexity of the convolutional neural network.
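The reduction can be made concrete with the F(2x2, 3x3) case. The sketch below uses the standard transformation matrices from the Winograd convolution literature (the patent's own G, B^T, and A are not reproduced in this text, so these matrices are an assumption) and checks the result against a direct convolution; the element-wise product uses (2+3-1)^2 = 16 multiplications per tile instead of 2*2*3*3 = 36.

```python
import numpy as np

# Standard F(2x2, 3x3) transformation matrices (assumed; not from the patent text).
B_T = np.array([[1, 0, -1, 0],
                [0, 1,  1, 0],
                [0, -1, 1, 0],
                [0, 1,  0, -1]], dtype=float)
G = np.array([[1,    0,   0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0,    0,   1]], dtype=float)
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)

def winograd_conv_2x2_3x3(d, g):
    """2x2 output of a 4x4 input tile d convolved with a 3x3 filter g."""
    U = G @ g @ G.T         # transformed filter (4x4)
    V = B_T @ d @ B_T.T     # transformed input tile (4x4)
    M = U * V               # element-wise product: 16 multiplications
    return A_T @ M @ A_T.T  # vs. 36 for the direct convolution

def direct_conv(d, g):
    """Reference: direct 'valid' cross-correlation of the same tile."""
    out = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            out[i, j] = np.sum(d[i:i+3, j:j+3] * g)
    return out

rng = np.random.default_rng(0)
d = rng.integers(0, 10, size=(4, 4)).astype(float)
g = rng.integers(-3, 4, size=(3, 3)).astype(float)
assert np.allclose(winograd_conv_2x2_3x3(d, g), direct_conv(d, g))
```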
  • the training process of a convolutional neural network is a process of continuously adjusting the weights, and this adjustment generally requires floating-point precision. Therefore, the elements of the original weight matrix and the original input neuron matrix are usually stored as floating-point numbers. However, floating-point operations consume a relatively large amount of computing resources. If simpler numeric types can be used for calculation without affecting the accuracy of the results of the convolutional neural network (meaning that the error of the result remains within the preset error range), the calculation speed of the convolutional neural network will be greatly improved and power consumption reduced.
  • Quantization refers to compressing the data of the original weight matrix and the original input neuron matrix by exploiting the redundancy in floating-point arithmetic precision. Since data compression does not change the network structure, it usually does not require retraining. The original weight matrix and the original input neuron matrix of the convolutional neural network can be quantized because their element values generally exist in the form of 32-bit floating-point numbers.
  • a general-purpose central processing unit or digital signal processor can support fixed-point numbers as small as 8 bits, and a large number of experiments show that convolutional neural networks are not sensitive to data bit width; that is, reducing the data bit width of a convolutional neural network has little impact on the result. Therefore, the original weight matrix and the original input neuron matrix can be quantized: their element values can be quantized from 32 bits to 8 bits.
  • the 8-bit fixed-point number mentioned here is an 8-bit fixed-point integer, that is, an 8-bit binary number. Compared with 32-bit floating-point numbers, 8-bit access reduces the required memory bandwidth to 25%.
  • the above makes better use of the cache and avoids random access memory (RAM) access bottlenecks.
  • SIMD here refers to Single Instruction, Multiple Data.
  • the quantization method may include a maximum and minimum quantization method, an information entropy quantization method, and the like.
  • entropy is a measure of information uncertainty. The greater the entropy, the higher the uncertainty of the information.
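As a quick illustration of the entropy measure (this is the standard Shannon entropy; the patent does not spell out its formula):

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`;
    larger entropy means greater uncertainty."""
    counts = Counter(values)
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

print(entropy([7, 7, 7, 7]))  # 0.0 (a constant signal carries no uncertainty)
print(entropy([0, 1, 2, 3]))  # 2.0 (uniform over four values)
```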
  • the maximum-minimum quantization method is as follows: when quantizing 32-bit floating point to 8-bit fixed point, determine the minimum floating-point value a and the maximum floating-point value b of the parameters, determine the parameter distribution interval [a, b] accordingly, and linearly map every parameter to the closest of the 255 8-bit fixed-point values spanning the distribution interval [a, b].
  • the parameters described here are the elements of the original weight matrix and the elements of the original input neuron matrix.
  • a nonlinear mapping can also be used to map the parameters to the closest of the 255 8-bit fixed-point values spanning the distribution interval [a, b].
  • For example, if the distribution interval of the elements of the original weight matrix is [-3, 5], a linear mapping is used to map every element of the original weight matrix to the closest of the 255 8-bit fixed-point values spanning [-3, 5]; for instance, -3 maps to -127 and 5 maps to 127.
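One plausible reading of that linear mapping is sketched below; the exact rounding rule and the symmetric -127..127 range are assumptions consistent with the "-3 maps to -127, 5 maps to 127" example.

```python
def minmax_quantize(x, a=-3.0, b=5.0):
    """Linearly map a float x in [a, b] to the nearest of the 255 signed
    8-bit values -127..127, so that a -> -127 and b -> 127 (a sketch; the
    patent does not give the exact formula)."""
    q = round((x - a) / (b - a) * 254) - 127
    return max(-127, min(127, q))  # clamp values outside [a, b]

print(minmax_quantize(-3.0))  # -127
print(minmax_quantize(5.0))   # 127
print(minmax_quantize(1.0))   # 0 (midpoint of [-3, 5])
```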
  • the actual situation described here mainly refers to the following: the distribution range of the elements in the original weight matrix differs from that of the elements in the original input neuron matrix. The elements of the original weight matrix are relatively concentrated, while the elements of the original input neuron matrix are spread over a relatively wide range. Therefore, the original weight matrix and the original input neuron matrix differ in their sensitivity to the quantization bit width.
  • this determines the order in which the time complexity of the convolutional neural network and its data bit width are reduced, that is, the order in which the Winograd transform and the quantization process are performed.
  • the threshold mentioned here can refer to the maximum and minimum floating-point values of the elements.
  • most of the elements in the original weight matrix can be covered, so that little information about the elements is lost; that is, most of the elements can be used and mapped onto a smaller distribution interval.
  • the error caused by quantization mainly comes from the precision lost to rounding.
  • therefore, the Winograd transform can be performed on the original weight matrix first, and the quantization process performed on the Winograd-transformed weight matrix afterwards. If the quantization process were performed on the original weight matrix first and the Winograd transform performed on the quantized matrix afterwards, the transform would accumulate and amplify the rounding errors introduced when the elements of the original weight matrix were quantized, and would therefore affect the results of the convolutional neural network.
  • the threshold described here can refer to the maximum and minimum floating-point values of the elements.
  • most of the elements in the original input neuron matrix may not be covered, so that more information about the elements is lost; that is, most of the elements may not be utilized.
  • a large number of experiments show that, when quantizing each element of the original input neuron matrix from 32 bits to 8 bits, the information entropy quantization method performs better, and quantizing in the time domain performs better than quantizing in the frequency domain.
  • because quantization of the original input neuron matrix works better in the time domain than in the frequency domain, and the Winograd transform converts data from the time domain to the frequency domain, the original input neuron matrix is quantized first and the Winograd transform is then performed on the quantized matrix.
  • the original weight matrix of the convolutional neural network in the embodiments of the present application denotes a weight matrix that has not undergone the Winograd transform or quantization;
  • the original input neuron matrix of the convolutional neural network denotes an input neuron matrix that has not undergone quantization or the Winograd transform.
  • Perform the Winograd transform and quantization process on the original weight matrix in turn to obtain the target weight matrix which can be understood as follows: first perform the Winograd transform on the original weight matrix to obtain the Winograd transform weight matrix, and then quantize the Winograd transform weight matrix Processing to get the target weight matrix.
  • the maximum-minimum quantization method or the information entropy quantization method can be used to quantize the Winograd transform weight matrix. Which quantization method is used can be set according to the actual situation and is not limited herein.
  • quantizing the original input neuron matrix and then performing the Winograd transform to obtain the target input neuron matrix can be understood as follows: first quantize the original input neuron matrix to obtain the quantized input neuron matrix, and then perform the Winograd transform on the quantized input neuron matrix to obtain the target input neuron matrix.
  • the information entropy method can be used to quantize the original input neuron matrix.
  • Step 140 Obtain the output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
  • the Winograd transform converts the matrices from the time domain to the frequency domain: the convolution operation performed by the original weight matrix and the original input neuron matrix in the time domain becomes, in the frequency domain, the dot product of the target weight matrix and the target input neuron matrix. The result of this dot product is the result of the convolutional neural network in the frequency domain; applying the output Winograd transform converts it back to the time domain, yielding the output neuron matrix of the convolutional neural network, which serves as the original input neuron matrix of the next layer.
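The per-tile flow of steps 120 through 140 can be sketched as follows. The standard F(2x2, 3x3) matrices and the per-tensor scales w_scale and x_scale are assumptions for illustration (the patent's own matrices and quantization constants are not reproduced in this text); the test values are chosen so that quantization loses no precision and the result matches a direct convolution exactly.

```python
import numpy as np

# Assumed standard F(2x2, 3x3) Winograd matrices (not from the patent text).
B_T = np.array([[1, 0, -1, 0],
                [0, 1,  1, 0],
                [0, -1, 1, 0],
                [0, 1,  0, -1]], dtype=float)
G = np.array([[1,    0,   0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0,    0,   1]], dtype=float)
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)

def process_tile(w, d, w_scale, x_scale):
    # weights: Winograd transform first, then quantize to the 8-bit range
    target_w = np.clip(np.round(G @ w @ G.T / w_scale), -127, 127)
    # inputs: quantize first (post-ReLU values assumed non-negative),
    # then Winograd transform
    q_in = np.clip(np.round(d / x_scale), 0, 127)
    target_x = B_T @ q_in @ B_T.T
    # frequency-domain dot product, then output transform back to time domain
    return (A_T @ (target_w * target_x) @ A_T.T) * w_scale * x_scale

def direct_corr(d, w):
    # reference: direct 'valid' convolution (cross-correlation) of the tile
    out = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            out[i, j] = np.sum(d[i:i+3, j:j+3] * w)
    return out

rng = np.random.default_rng(2)
w = rng.integers(-3, 4, size=(3, 3)) * 0.01          # weights on a 0.01 grid
d = rng.integers(0, 128, size=(4, 4)).astype(float)  # 8-bit post-ReLU inputs
out = process_tile(w, d, w_scale=0.0025, x_scale=1.0)
assert np.allclose(out, direct_corr(d, w))  # scales chosen so nothing is lost
```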
  • Steps 120 and 130 can be executed synchronously, or step 120 can be executed first, and then step 130 can be executed. Step 130 can be executed first, and then step 120 can be executed. The order of execution can be determined according to the actual situation, and is not limited herein.
  • In the technical solution of this embodiment, the original weight matrix is sequentially subjected to the Winograd transform and quantization to obtain the target weight matrix, and the original input neuron matrix is sequentially subjected to quantization and the Winograd transform to obtain the target input neuron matrix.
  • the output neuron matrix of the convolutional neural network is then obtained from these. The Winograd transform reduces the time complexity of the convolutional neural network, and quantization reduces its data bit width, which in turn improves the calculation speed of the convolutional neural network.
  • the original weight matrix is sequentially subjected to Winograd transform and quantization to obtain a target weight matrix, which may include: obtaining a target weight transform matrix.
  • the original weight matrix is subjected to the Winograd transform according to the target weight transformation matrix to obtain the Winograd-transformed weight matrix;
  • the Winograd-transformed weight matrix is quantized to obtain the target weight matrix.
  • that is, the original weight matrix is sequentially subjected to the Winograd transform and quantization to obtain the target weight matrix as follows: obtain the target weight transformation matrix, perform the Winograd transform on the original weight matrix according to the target weight transformation matrix to obtain the Winograd-transformed weight matrix, and then quantize the Winograd-transformed weight matrix to obtain the target weight matrix. The target weight transformation matrix described here can be determined according to the preset Winograd transformation parameters, which refer to the dimension of the output neuron matrix M_out and the dimension of the original weight matrix N.
  • the output neuron matrix M_out is a W_out*H_out matrix;
  • the original weight matrix N is an F*F matrix; that is, the target weight transformation matrix is determined according to F(W_out*H_out, F*F).
  • the target weight transformation matrix is
  • the original input neuron matrix is sequentially subjected to quantization processing and Winograd transformation to obtain the target input neuron matrix, which may include: obtaining the target input neuron transformation matrix.
  • the original input neuron matrix is quantized to obtain a quantized input neuron matrix.
  • Winograd transform is performed on the quantized input neuron matrix to obtain the target input neuron matrix.
  • that is, the original input neuron matrix is sequentially quantized and Winograd-transformed to obtain the target input neuron matrix as follows: obtain the target input neuron transformation matrix, quantize the original input neuron matrix to obtain the quantized input neuron matrix, and then perform the Winograd transform on the quantized input neuron matrix according to the target input neuron transformation matrix to obtain the target input neuron matrix.
  • the target input neuron transformation matrix described here can be determined according to the preset Winograd transformation parameters
  • the preset Winograd transformation parameters mentioned here refer to the dimension of the output neuron matrix M_out and the dimension of the original weight matrix N.
  • the output neuron matrix M_out is a W_out*H_out matrix;
  • the original weight matrix N is an F*F matrix;
  • the target input neuron transformation matrix is determined according to F(W_out*H_out, F*F).
  • the target input neuron transformation matrix is
  • the target weight transformation matrix and the target input neuron transformation matrix can be determined as follows: the original weight transformation matrix and the original input neuron transformation matrix are determined according to the preset Winograd transformation parameters, and the original weight transformation The number of rows of the matrix is equal to the number of rows of the original input neuron transformation matrix.
  • after a ReLU (rectified linear unit) activation, the quantized input neuron elements lie in the value range [0, 127]; sums of such elements formed during the transform can exceed 127, that is, exceed the 8-bit storage range. This can be understood as an overflow problem in the process of quantizing the original input neuron matrix. To solve this problem, the transformation matrices involved in the Winograd transform can be adjusted.
  • the output neuron transformation matrix A, the original weight transformation matrix G, and the original input neuron transformation matrix B^T may be determined according to the preset Winograd transformation parameters, which refer to the dimension of the output neuron matrix M_out (a W_out*H_out matrix) and the dimension of the original weight matrix N (an F*F matrix); that is, G and B^T can be determined according to F(W_out*H_out, F*F), and the number of rows of the original weight transformation matrix G is equal to the number of rows of the original input neuron transformation matrix B^T.
  • the output neuron matrix of the convolutional neural network is the result of dot multiplication of the target weight matrix and the target input neuron matrix
  • the multiple element values of the corresponding i-th row of the original weight transformation matrix are divided by the preset ratio value, respectively, to obtain the target input neuron transformation matrix and the target weight transformation matrix.
  • when no overflow occurs, the original input neuron transformation matrix is the target input neuron transformation matrix.
  • the single SIMD instruction vrhadd can be used directly to implement the shift operation, so the above incurs no additional instruction overhead.
  • the preset ratio value is less than 1;
  • i is a positive integer less than or equal to n, where n is the number of rows of the original input neuron transformation matrix;
  • this makes it possible to keep the element values of the target input neuron matrix within the value range [0, 127], which solves the overflow problem in the process of quantizing the original input neuron matrix.
  • For example, suppose the original input neuron transformation matrix B^T, the original weight transformation matrix G, and the quantized input neuron matrix have been determined. When B^T is multiplied with the quantized input neuron matrix, the result of the second row of B^T operating on the quantized input neuron matrix is b+c, where b and c are numbers in [0, 127]; b+c may therefore be greater than 127, that is, the overflow problem occurs.
  • A preset ratio value is therefore set. The element values of the second row of the original input neuron transformation matrix B^T are multiplied by the preset ratio value, while the element values in the other rows of B^T remain unchanged, giving the target input neuron transformation matrix. Correspondingly, the multiple elements of the second row of the original weight transformation matrix G are divided by the preset ratio value, while the element values in the other rows of G remain unchanged, giving the target weight transformation matrix.
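Why the compensation preserves the result can be checked numerically. The sketch below assumes the standard F(2x2, 3x3) matrices, a hypothetical preset ratio value of 0.5, and the second row (index 1) as the overflow-prone row; none of these specific values appear in this text.

```python
import numpy as np

# Assumed standard F(2x2, 3x3) matrices (the patent's matrices are elided).
B_T = np.array([[1, 0, -1, 0],
                [0, 1,  1, 0],
                [0, -1, 1, 0],
                [0, 1,  0, -1]], dtype=float)
G = np.array([[1,    0,   0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0,    0,   1]], dtype=float)

s, i = 0.5, 1                            # hypothetical ratio value and row index
B_T_new = B_T.copy(); B_T_new[i] *= s    # shrink the overflow-prone row of B^T
G_new = G.copy();     G_new[i] /= s      # compensate in the weight transform

rng = np.random.default_rng(1)
d = rng.uniform(0, 127, size=(4, 4))     # quantized input tile in [0, 127]
g = rng.uniform(-1, 1, size=(3, 3))

U_old, V_old = G @ g @ G.T, B_T @ d @ B_T.T
U_new, V_new = G_new @ g @ G_new.T, B_T_new @ d @ B_T_new.T

# Row i (and, by symmetry, column i) of the transformed input is scaled by s,
# keeping values below 128, while the element-wise product (and hence the
# final convolution result) is unchanged.
assert np.allclose(U_old * V_old, U_new * V_new)
```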
  • quantizing the original input neuron matrix to obtain the quantized input neuron matrix may include: adding the preset adjustment value to each of the element values in the original input neuron matrix to obtain the adjusted original input neuron matrix;
  • the adjusted original input neuron matrix is then rounded toward zero to obtain the quantized input neuron matrix.
  • For example, the preset adjustment value can be set to 0.5 (float): 0.5 is added to each element of the original input neuron matrix, and rounding toward zero is then used to quantize the adjusted original input neuron matrix, yielding the quantized input neuron matrix.
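A minimal sketch of the add-0.5-then-round-toward-zero rule. For the non-negative, post-ReLU values considered here this coincides with ordinary rounding to the nearest integer (for negative inputs, rounding toward zero would behave differently).

```python
import math

def quantize_round(x):
    """Add the preset adjustment value 0.5, then round toward zero."""
    return math.trunc(x + 0.5)

print(quantize_round(3.2))   # 3
print(quantize_round(3.7))   # 4
print(quantize_round(0.49))  # 0
```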
  • the value range of multiple elements in the target weight matrix can be [-127, 127].
  • the range of values of multiple elements in the target weight matrix can be [0, 127].
  • the value range of multiple elements in the target input neuron matrix can be [0, 127].
  • after the original input neuron matrix is quantized (that is, each of its elements is quantized from 32 bits to 8 bits), the value range of the elements of the quantized input neuron matrix is [0, 127].
  • the value range of the elements in the target input neuron matrix obtained by performing the Winograd transform on the quantized input neuron matrix can then all be within [0, 127].
  • the technical solution provided by the embodiments of the present application reduces the time complexity of the convolutional neural network through the Winograd transform and reduces its data bit width through quantization; moreover, according to the distribution of elements in the original weight matrix and the original input neuron matrix, the Winograd transform is performed first and quantization afterwards for the weight matrix, while quantization is performed first and the Winograd transform afterwards for the input neuron matrix.
  • the above enables the convolutional neural network to basically meet the requirements of real-time operation on mobile devices.
  • FIG. 2 is a flowchart of another convolutional neural network processing method provided by an embodiment of the present application. This embodiment may be applicable to a case where the calculation speed of a convolutional neural network is increased.
  • the method may be executed by a convolutional neural network processing device.
  • the device may be implemented in software and/or hardware, and may be configured in equipment such as a computer or a mobile terminal. As shown in FIG. 2, the method includes the following steps:
  • Step 2010 Obtain the original weight matrix and original input neuron matrix of the convolutional neural network.
  • Step 2020 Determine the original weight transformation matrix and the original input neuron transformation matrix according to the preset Winograd transformation parameters. The number of rows of the original weight transformation matrix and the original input neuron transformation matrix are equal.
  • Step 2030 Determine whether data overflow occurs during the Winograd transformation of the quantized input neuron matrix according to the original input neuron transformation matrix. If data overflow occurs, step 2040 is performed; if data overflow does not occur, step 2050 is performed.
  • Step 2040 Multiply the multiple element values in the i-th row of the original input neuron transformation matrix by the preset ratio value to obtain the new element values of the i-th row of the original input neuron transformation matrix, and divide the multiple element values of the i-th row of the original weight transformation matrix by the preset ratio value to obtain the new element values of the i-th row of the original weight transformation matrix; the preset ratio value is less than 1.
  • Step 2050 Keep the element values of the i-th row in the original input neuron transformation matrix unchanged, and keep the element values of the i-th row in the original weight transformation matrix unchanged.
  • Step 2060 Generate a target input neuron transformation matrix and a target weight transformation matrix.
  • Step 2070 Perform a Winograd transformation on the original weight matrix according to the target weight transformation matrix to obtain a Winograd transformation weight matrix.
  • Step 2080 Perform quantization processing on the Winograd transform weight matrix to obtain the target weight matrix.
  • Step 2090 Perform quantization processing on the original input neuron matrix to obtain a quantized input neuron matrix.
  • Step 2100 Perform a Winograd transformation on the quantized input neuron matrix according to the target input neuron transformation matrix to obtain the target input neuron matrix.
  • Step 2110 Obtain the output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
  • Steps 2070-2080 and steps 2090-2100 may be performed synchronously; alternatively, steps 2070-2080 may be executed first and then steps 2090-2100, or steps 2090-2100 first and then steps 2070-2080. The execution order can be determined according to the actual situation and is not limited here.
  • In this technical solution, the original weight matrix undergoes the Winograd transform and then quantization to obtain the target weight matrix, and the original input neuron matrix undergoes quantization and then the Winograd transform to obtain the target input neuron matrix; the output neuron matrix of the convolutional neural network is then obtained from these two matrices. The Winograd transform reduces the time complexity of the convolutional neural network and quantization reduces its data bit width, which in turn improves the calculation speed of the convolutional neural network.
  • FIG. 3 is a schematic structural diagram of a convolutional neural network processing device provided by an embodiment of the present application. This embodiment may be applicable to a case where the calculation speed of a convolutional neural network is improved. The device may be implemented in software and/or hardware and can be configured in a device such as a computer or mobile terminal. As shown in FIG. 3, the device includes:
  • the original matrix obtaining module 310 is configured to obtain the original weight matrix and the original input neuron matrix of the convolutional neural network.
  • the target matrix generation module 320 is configured to perform the Winograd transform and quantization on the original weight matrix in sequence to obtain the target weight matrix, and to perform quantization and the Winograd transform on the original input neuron matrix in sequence to obtain the target input neuron matrix.
  • the output neuron matrix generation module 330 is configured to obtain the output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
  • In this technical solution, the original weight matrix undergoes the Winograd transform and then quantization to obtain the target weight matrix, and the original input neuron matrix undergoes quantization and then the Winograd transform to obtain the target input neuron matrix; the output neuron matrix of the convolutional neural network is then obtained from these two matrices. The Winograd transform reduces the time complexity of the convolutional neural network and quantization reduces its data bit width, which in turn improves the calculation speed of the convolutional neural network.
  • FIG. 4 is a schematic structural diagram of a device provided by an embodiment of the present application.
  • FIG. 4 shows a block diagram of an exemplary device 412 suitable for implementing the present embodiment.
  • the device 412 is represented in the form of a general-purpose computing device.
  • Components of device 412 may include, but are not limited to, one or more processors 416, system memory 428, and bus 418 connected to different system components (including system memory 428 and processor 416).
  • the system memory 428 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 430 and/or a cache memory 432.
  • the storage system 434 can be used to read and write non-removable, non-volatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard drive").
  • the program/utility tool 440 having a set (at least one) of program modules 442 may be stored in the memory 428, for example.
  • the device 412 may also communicate with one or more external devices 414 (e.g., keyboard, pointing device, display 424, etc.). This communication can be performed via input/output (I/O) interfaces 422.
  • the device 412 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 420.
  • The processor 416 executes various functional applications and data processing by running the programs stored in the system memory 428, for example to implement a convolutional neural network processing method provided in an embodiment of the present application, including: obtaining the original weight matrix and the original input neuron matrix of the convolutional neural network; performing the Winograd transform and quantization on the original weight matrix in sequence to obtain the target weight matrix, and performing quantization and the Winograd transform on the original input neuron matrix in sequence to obtain the target input neuron matrix; and obtaining the output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
  • An embodiment of the present application further provides a computer-readable storage medium that stores a computer program; when the program is executed by a processor, a convolutional neural network processing method as provided in the embodiments of the present application is implemented. The method includes: obtaining the original weight matrix and the original input neuron matrix of the convolutional neural network; performing the Winograd transform and quantization on the original weight matrix in sequence to obtain the target weight matrix, and performing quantization and the Winograd transform on the original input neuron matrix in sequence to obtain the target input neuron matrix; and obtaining the output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed herein are a convolutional neural network processing method, apparatus, device, and storage medium. The method includes: obtaining an original weight matrix and an original input neuron matrix of a convolutional neural network; performing a Winograd transform and then quantization on the original weight matrix to obtain a target weight matrix, and performing quantization and then a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix; and obtaining an output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.

Description

Convolutional neural network processing method, apparatus, device, and storage medium
This application claims priority to Chinese patent application No. 201811627040.3, filed with the China Patent Office on December 28, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present application relate to deep learning technology, for example, to a convolutional neural network processing method, apparatus, device, and storage medium.
Background
Since the AlexNet network architecture was proposed in 2012, convolutional neural networks have achieved great success in image processing. In numerous image competitions, convolutional neural networks have far outperformed traditional algorithms and repeatedly set new records on a variety of industry benchmarks.
Driven by information-security and low-latency requirements, neural network computation needs to migrate from the cloud to mobile terminals. However, as the performance of convolutional neural networks has improved, their models have grown increasingly complex and their computational cost has risen sharply. In the cloud, graphics processing units (GPUs) can be relied upon for parallel computation to accelerate convolutional neural networks, whereas on mobile terminals the relative scarcity of computing resources prevents the computation speed of convolutional neural networks from being raised, making real-time operation on mobile terminals impossible.
Summary
Embodiments of the present application provide a convolutional neural network processing method, apparatus, device, and storage medium to improve the computation speed of a convolutional neural network.
An embodiment of the present application provides a convolutional neural network processing method, including:
obtaining an original weight matrix and an original input neuron matrix of a convolutional neural network;
performing a Winograd transform and then quantization on the original weight matrix to obtain a target weight matrix, and performing quantization and then a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix; and
obtaining an output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
An embodiment of the present application further provides a convolutional neural network processing apparatus, including:
an original matrix obtaining module, configured to obtain an original weight matrix and an original input neuron matrix of a convolutional neural network;
a target matrix generation module, configured to perform a Winograd transform and then quantization on the original weight matrix to obtain a target weight matrix, and to perform quantization and then a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix; and
an output neuron matrix generation module, configured to obtain an output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
An embodiment of the present application further provides a device, including:
one or more processors; and
a memory configured to store one or more programs;
where, when executed by the one or more processors, the one or more programs cause the one or more processors to implement the method provided by the embodiments of the present application.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method provided by the embodiments of the present application.
Brief Description of the Drawings
FIG. 1 is a flowchart of a convolutional neural network processing method provided by an embodiment of the present application;
FIG. 2 is a flowchart of another convolutional neural network processing method provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a convolutional neural network processing apparatus provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a device in an embodiment of the present invention.
Detailed Description
The present application is described below with reference to the drawings and embodiments. The specific embodiments described here are intended only to explain the present application, not to limit it. For ease of description, the drawings show only the parts related to the present application rather than the complete structure.
A convolutional neural network generally consists of three parts: the first part is the input layer; the second part is a combination of convolutional layers, activation layers, and pooling (or downsampling) layers; and the third part is a fully connected multilayer perceptron classifier (i.e., the fully connected layer). The convolutional layers are responsible for feature extraction and rely on two key concepts, receptive fields and weight sharing; the pooling layers perform local averaging and subsampling to reduce the sensitivity of features to shifts and distortions, since the exact position of a feature is secondary while its position relative to other features matters more; and the fully connected layer performs classification.
The convolutional layer is the core of a convolutional neural network. It is composed of two-dimensional planes of neurons called feature maps; each neuron in a feature map is connected to a small subset (called the receptive field) of the neurons of the previous layer, and these local receptive fields over the two-dimensional space allow the convolutional neural network to extract elementary features from the input. The neurons in one feature map share one or more weight matrices (also called convolution kernels). Weight sharing not only greatly reduces the number of weights to be trained and the amount of training data required, but also lowers the complexity of the neural network and mitigates overfitting. The feature extraction described above is the convolution operation. A feature map can be understood as a neuron matrix; accordingly, a feature map of the current layer before feature extraction (i.e., before convolution) can be taken as the input neuron matrix, and a feature map after feature extraction (i.e., after convolution) as the output neuron matrix. The output neuron matrix of the current layer is the input neuron matrix of the next layer.
The convolution operations of the convolutional layers account for most of the computation of a convolutional neural network. The amount of computation refers to the number of operations performed by the network, which can be expressed as time complexity. The convolution computation proceeds as follows: suppose the original input neuron matrix M_in is a W_in×H_in matrix, the original weight matrix N is an F×F matrix, the convolution stride is S, and the zero-padding amount is P. Then each output neuron matrix M_out is a W_out×H_out matrix, and the number of output neuron matrices equals K, i.e., the number of weight matrices, where W_out = (W_in − F + 2P)/S + 1 and H_out = (H_in − F + 2P)/S + 1. It follows that obtaining each output neuron matrix from the original input neuron matrix and the original weight matrix requires W_out × H_out × F × F multiplications; the multiplications in the convolution operation account for part of the time complexity of the convolutional neural network.
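The output-size and multiplication-count formulas above can be sketched in Python. This is an illustrative sketch, not code from the patent; the function names are made up, and the integer division assumes the stride divides the padded extent evenly:

```python
def conv_output_size(w_in, h_in, f, s, p):
    # W_out = (W_in - F + 2P)/S + 1 and H_out = (H_in - F + 2P)/S + 1
    w_out = (w_in - f + 2 * p) // s + 1
    h_out = (h_in - f + 2 * p) // s + 1
    return w_out, h_out

def direct_mul_count(w_out, h_out, f):
    # direct convolution needs W_out * H_out * F * F multiplications
    # per output neuron matrix
    return w_out * h_out * f * f

print(conv_output_size(4, 4, 3, 1, 0))  # a 4x4 input and 3x3 kernel give a 2x2 output
print(direct_mul_count(2, 2, 3))        # 36 multiplications
```

This multiplication count is the baseline that the Winograd transform discussed below reduces.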
To increase the computation speed of a convolutional neural network, improvements can be considered in two respects. First, reduce the time complexity of the network. Since the multiplications in the convolution operation account for part of the time complexity, the number of multiplications can be reduced by replacing time-consuming operations (such as multiplication) with less expensive ones (such as addition), for example by transforming the data from the time domain to the frequency domain for processing. Second, reduce the data bit width of the network, which can be done by quantization. Replacing time-consuming operations with less expensive ones lowers the complexity of the network only on the premise that the clock cycles of the time-consuming operation exceed those of its replacement.
Reducing the data bit width of a convolutional neural network by quantization sacrifices some numerical precision, and replacing time-consuming operations with less expensive ones to reduce the time complexity also sacrifices precision. These approaches therefore raise the computation speed of the network by lowering the precision of its results. If the error of the results obtained in this way lies within a preset error range, the approach can be considered feasible; if not, the approach must be considered infeasible even though it speeds up the computation. In other words, the premise for these approaches is that the error of the resulting output stays within the preset error range. Accordingly, to ensure feasibility, it is necessary to consider how to reduce the error of the results, i.e., to improve their precision. This is explained below with reference to specific embodiments.
FIG. 1 is a flowchart of a convolutional neural network processing method provided by an embodiment of the present application. This embodiment is applicable to the case of increasing the computation speed of a convolutional neural network. The method may be executed by a convolutional neural network processing apparatus, which may be implemented in software and/or hardware and configured in a device, for example a computer or a mobile terminal. As shown in FIG. 1, the method includes the following steps:
Step 110: obtain an original weight matrix and an original input neuron matrix of a convolutional neural network.
Step 120: perform a Winograd transform and then quantization on the original weight matrix to obtain a target weight matrix.
Step 130: perform quantization and then a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix.
In the embodiments of the present application, as described above, the computation speed of a convolutional neural network can be increased by reducing its time complexity and by reducing its data bit width. To reduce the time complexity, time-consuming operations (such as multiplication) can be replaced with less expensive operations (such as addition), i.e., the number of inexpensive operations is increased while the number of expensive operations is reduced. In one embodiment, the Winograd transform is used to reduce the complexity of the convolutional neural network. The Winograd transform is described below.
The Winograd transform is a widely used fast matrix multiplication algorithm whose idea is to add a small number of additions in exchange for fewer multiplications. The Winograd transform formula is Y = A^T[N_w ⊙ W_inw]A, where A denotes the output neuron transformation matrix, A^T denotes its transpose, and ⊙ denotes element-wise multiplication; N_w denotes the matrix obtained by applying the Winograd transform to the original weight matrix N, with N_w = G·N·G^T, where G denotes the original weight transformation matrix; and W_inw denotes the matrix obtained by applying the Winograd transform to the original input neuron matrix W_in, with W_inw = B^T·W_in·B, where B^T denotes the original input neuron transformation matrix. The output neuron transformation matrix A, the original weight transformation matrix G, and the original input neuron transformation matrix B^T can be determined from preset Winograd transform parameters, which here are the dimensions of the output neuron matrix M_out and of the original weight matrix N: M_out is a W_out×H_out matrix and N is an F×F matrix, so G and B^T can be determined from F(W_out×H_out, F×F).
When M_out is a W_out×H_out matrix and N is an F×F matrix, the number of multiplications needed to complete the convolution using the Winograd transform is μ(F(W_out×H_out, F×F)) = (W_out + F − 1)(H_out + F − 1); that is, after applying the Winograd transform to the original input neuron matrix W_in and the original weight matrix N, obtaining the output neuron matrix from the transformed matrices requires (W_out + F − 1)(H_out + F − 1) multiplications. As described above, completing the convolution without the Winograd transform requires W_out × H_out × F × F multiplications. On this basis, the Winograd transform reduces the number of multiplications, and hence the time complexity, of the convolutional neural network.
The above describes the Winograd transform for the case of a single original weight matrix. When there are two or more original weight matrices, each can be transformed in the manner described above and the transformed results summed.
For example, when the output neuron matrix M_out is 2×2 and the original weight matrix N is 3×3, the transformation matrices for F(2×2, 3×3) (shown as an image in the original document) are the standard Winograd matrices
B^T = [[1, 0, −1, 0], [0, 1, 1, 0], [0, −1, 1, 0], [0, 1, 0, −1]]
G = [[1, 0, 0], [1/2, 1/2, 1/2], [1/2, −1/2, 1/2], [0, 0, 1]]
A^T = [[1, 1, 1, 0], [0, 1, −1, −1]]
With the Winograd transform, the number of multiplications needed to obtain the output neurons is μ(F(2×2, 3×3)) = (2 + 3 − 1)(2 + 3 − 1) = 16, whereas without the Winograd transform it is 2 × 2 × 3 × 3 = 36. Thus, the Winograd transform reduces the time complexity of the convolutional neural network.
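The saving can be checked in the one-dimensional case F(2, 3), where Winograd produces the same two outputs of a 3-tap filter with 4 multiplications instead of the 6 needed by direct convolution. This is an illustrative sketch, not code from the patent:

```python
def direct_conv3(d, g):
    # direct 1-D convolution: 2 outputs of a 3-tap filter, 6 multiplications
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]

def winograd_f23(d, g):
    # Winograd F(2, 3): the same 2 outputs with only 4 multiplications
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]

print(direct_conv3([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0]))   # [6.0, 9.0]
print(winograd_f23([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0]))   # [6.0, 9.0]
```

The 2-D F(2×2, 3×3) case nests this scheme in both dimensions, which is where the 16-versus-36 multiplication count comes from.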
Because training a convolutional neural network is a process of continually adjusting the weights, and these adjustments generally require floating-point precision, the elements of the original weight matrix and the original input neuron matrix are usually stored as floating-point numbers. Floating-point arithmetic, however, consumes considerable computing resources. If simpler numeric types can be used without affecting the accuracy of the network's results — meaning that the error of the results obtained after processing stays within the preset error range — the computation speed of the convolutional neural network can be greatly increased and power consumption reduced, which is especially important for mobile terminals such as embedded systems that cannot execute floating-point operations efficiently. On this basis, quantization can be used to reduce the data bit width of the convolutional neural network and thereby increase its computation speed.
Quantization exploits the redundancy of floating-point precision to compress the data of the original weight matrix and the original input neuron matrix; since this compression does not change the network structure, retraining is usually unnecessary. The original weight matrix and the original input neuron matrix can be quantized because their element values generally exist as 32-bit floating-point numbers, while general-purpose central processing units and digital signal processors support fixed-point integers as small as 8 bits, and extensive experiments show that convolutional neural networks are not very sensitive to data bit width — that is, reducing the bit width has little effect on the results. The element values of the original weight matrix and the original input neuron matrix can therefore be quantized from 32 bits to 8 bits. The 8-bit fixed-point numbers referred to here are 8-bit fixed-point integers, i.e., 8-bit binary numbers. Compared with floating-point numbers, 8-bit accesses cut the memory bandwidth to 25%, which makes better use of the cache and avoids access bottlenecks in random access memory (RAM). In addition, the throughput of single instruction, multiple data (SIMD) execution is improved, i.e., SIMD can perform more operations per clock cycle.
Quantization methods include the min-max quantization method and the information-entropy quantization method, among others. In information theory, entropy is a measure of the uncertainty of information: the larger the entropy, the greater the uncertainty. The min-max quantization method works as follows: when 32-bit floating-point numbers are to be quantized to 8-bit fixed-point numbers, determine the minimum floating-point value a and the maximum floating-point value b of the parameters, determine the distribution interval [a, b] of the parameters from them, and linearly map every parameter to the nearest of the 255 8-bit fixed-point values over [a, b]. The parameters here are the elements of the original weight matrix and the elements of the original input neuron matrix; a nonlinear mapping may also be used to map all parameters to the nearest of the 255 8-bit fixed-point values over [a, b]. For example, if the elements of the original weight matrix are distributed over [−3, 5], linearly mapping all elements to the nearest of the 255 8-bit fixed-point values over [−3, 5] maps −3 to −127 and 5 to 127.
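The min-max mapping in the example — an affine map of the observed range [−3, 5] onto [−127, 127] — can be sketched as follows. This is an illustrative reconstruction, not code from the patent, and the function name is made up:

```python
def minmax_quantize(values):
    # linearly map the observed range [a, b] onto the 8-bit range [-127, 127],
    # i.e. onto 255 evenly spaced fixed-point levels
    a, b = min(values), max(values)
    scale = 254.0 / (b - a)
    return [int(round((v - a) * scale - 127.0)) for v in values]

print(minmax_quantize([-3.0, 5.0, 1.0]))  # [-127, 127, 0]
```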
As described above, both reducing the time complexity of the convolutional neural network and reducing its data bit width lower the precision of its results. Combining the two without regard to the actual situation would therefore make the error of the results exceed the preset error range, rendering the results untrustworthy and the approach infeasible. A feasible procedure must take the actual situation into account, which here mainly means the following: the distribution ranges of the elements of the original weight matrix and of the original input neuron matrix differ — the elements of the original weight matrix are concentrated, while those of the original input neuron matrix are spread widely — so the two matrices differ in their sensitivity to the quantization bit width. Consequently, the order in which the time-complexity reduction and the bit-width reduction are applied, i.e., the order of the Winograd transform and quantization, must be considered separately for the original weight matrix and the original input neuron matrix.
For the original weight matrix, the concentrated distribution of its elements makes it relatively easy to determine thresholds for quantization, the thresholds here being the maximum and minimum floating-point element values. Filtering the elements of the original weight matrix by these thresholds covers most of the elements, so little is lost: most elements can be used and mapped into a small distribution interval. Extensive experiments show that quantizing the elements of the original weight matrix from 32 bits to 8 bits loses little accuracy regardless of the quantization method, and even smaller bit widths are possible; the quantization error comes mainly from the precision lost in rounding. On this basis, for the original weight matrix, the Winograd transform can be applied first and quantization applied afterwards to the Winograd-transformed weight matrix. If quantization were applied first and the Winograd transform afterwards, the rounding error introduced when quantizing the weight elements would be accumulated and amplified, affecting the results of the convolutional neural network.
For the original input neuron matrix, the wide distribution of its elements makes it relatively difficult to determine thresholds for quantization, again the maximum and minimum floating-point element values. Filtering the elements of the original input neuron matrix by these thresholds may fail to cover most of the elements, so many elements are lost and may be unusable. Extensive experiments show that when quantizing the elements of the original input neuron matrix from 32 bits to 8 bits, the information-entropy quantization method works well, and quantizing in the time domain works better than in the frequency domain. Since the Winograd transform converts data from the time domain to the frequency domain, quantization must be applied to the original input neuron matrix first and the Winograd transform applied afterwards to the quantized input neuron matrix.
On this basis, in the embodiments of the present application the original weight matrix of the convolutional neural network is a weight matrix that has undergone neither the Winograd transform nor quantization, and the original input neuron matrix is an input neuron matrix that has undergone neither quantization nor the Winograd transform.
Performing a Winograd transform and then quantization on the original weight matrix to obtain the target weight matrix can be understood as follows: the original weight matrix is first Winograd-transformed to obtain a Winograd-transformed weight matrix, which is then quantized to obtain the target weight matrix. In this embodiment, the min-max quantization method or the information-entropy quantization method may be used to quantize the Winograd-transformed weight matrix; which quantization method to use can be set according to the actual situation and is not limited here.
Performing quantization and then a Winograd transform on the original input neuron matrix to obtain the target input neuron matrix can be understood as follows: the original input neuron matrix is first quantized to obtain a quantized input neuron matrix, which is then Winograd-transformed to obtain the target input neuron matrix. In this embodiment, the information-entropy method may be used to quantize the original input neuron matrix.
Step 140: obtain an output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
In the embodiments of the present application, since the Winograd transform converts matrices from the time domain to the frequency domain, the convolution performed in the time domain on the original weight matrix and the original input neuron matrix becomes an element-wise multiplication performed in the frequency domain on the target weight matrix and the target input neuron matrix. The output neuron matrix of the convolutional neural network obtained from the target weight matrix and the target input neuron matrix is the frequency-domain result of the convolutional neural network; a Winograd transform can then be applied to this output neuron matrix to convert it into the time-domain result, i.e., the original input neuron matrix of the next layer.
Step 120 and step 130 may be performed synchronously; alternatively, step 120 may be performed first and then step 130, or step 130 first and then step 120. The execution order can be determined according to the actual situation and is not limited here.
In the technical solution of this embodiment, the original weight matrix and the original input neuron matrix of a convolutional neural network are obtained; the original weight matrix undergoes a Winograd transform and then quantization to obtain a target weight matrix, and the original input neuron matrix undergoes quantization and then a Winograd transform to obtain a target input neuron matrix; the output neuron matrix of the convolutional neural network is then obtained from the target weight matrix and the target input neuron matrix. The Winograd transform reduces the time complexity of the convolutional neural network and quantization reduces its data bit width, thereby increasing the computation speed of the convolutional neural network.
On the basis of the above technical solution, performing a Winograd transform and then quantization on the original weight matrix to obtain the target weight matrix may include: obtaining a target weight transformation matrix; performing a Winograd transform on the original weight matrix according to the target weight transformation matrix to obtain a Winograd-transformed weight matrix; and quantizing the Winograd-transformed weight matrix to obtain the target weight matrix.
In the embodiments of the present application, the target weight transformation matrix here can be determined from the preset Winograd transform parameters, which are the dimensions of the output neuron matrix M_out and of the original weight matrix N. In this embodiment, M_out is a W_out×H_out matrix and N is an F×F matrix, i.e., the target weight transformation matrix is determined from F(W_out×H_out, F×F).
For example, for F(2×2, 3×3), the target weight transformation matrix (shown as an image in the original document) is the standard Winograd matrix
G = [[1, 0, 0], [1/2, 1/2, 1/2], [1/2, −1/2, 1/2], [0, 0, 1]]
On the basis of the above technical solution, performing quantization and then a Winograd transform on the original input neuron matrix to obtain the target input neuron matrix may include: obtaining a target input neuron transformation matrix; quantizing the original input neuron matrix to obtain a quantized input neuron matrix; and performing a Winograd transform on the quantized input neuron matrix according to the target input neuron transformation matrix to obtain the target input neuron matrix.
In an example of the present application, the target input neuron transformation matrix here can be determined from the preset Winograd transform parameters, which are the dimensions of the output neuron matrix M_out and of the original weight matrix N. In this embodiment, M_out is a W_out×H_out matrix and N is an F×F matrix, i.e., the target input neuron transformation matrix is determined from F(W_out×H_out, F×F).
For example, for F(2×2, 3×3), the target input neuron transformation matrix (shown as an image in the original document) is the standard Winograd matrix
B^T = [[1, 0, −1, 0], [0, 1, 1, 0], [0, −1, 1, 0], [0, 1, 0, −1]]
On the basis of the above technical solution, the target weight transformation matrix and the target input neuron transformation matrix may be determined as follows: determine an original weight transformation matrix and an original input neuron transformation matrix according to the preset Winograd transform parameters, where the two matrices have the same number of rows. If data overflow occurs while the quantized input neuron matrix is being Winograd-transformed according to the original input neuron transformation matrix, multiply the element values of the i-th row — the row corresponding to the overflowing data — of the original input neuron transformation matrix by a preset ratio value to obtain the target input neuron transformation matrix, and divide the element values of the i-th row of the original weight transformation matrix by the preset ratio value to obtain the target weight transformation matrix, where the preset ratio value is less than 1 and i is a positive integer less than or equal to n, n being the number of rows of the original input neuron transformation matrix.
In the embodiments of the present application, the original input neuron matrix is usually a matrix processed by an activation layer, and the activation function usually used by the activation layer is the rectified linear unit (ReLU) function f(x) = max(0, x), where f(x) can represent the element values of the original input neuron matrix. After the original input neuron matrix is quantized — that is, its elements are quantized from 32 bits to 8 bits — the elements of the quantized input neuron matrix lie in the range [0, 127]. When the quantized input neuron matrix is then Winograd-transformed, the element values of the resulting target input neuron matrix may exceed the range [0, 127], i.e., exceed the 8-bit storage range. This can be understood as an overflow problem arising in the course of quantizing the original input neuron matrix. To solve it, the transformation matrices involved in the Winograd transform can be adjusted.
As described above, the Winograd transform formula is Y = A^T[N_w ⊙ W_inw]A, where A denotes the output neuron transformation matrix, A^T its transpose, N_w = G·N·G^T the Winograd transform of the original weight matrix N with G the original weight transformation matrix, and W_inw = B^T·W_in·B the Winograd transform of the original input neuron matrix W_in with B^T the original input neuron transformation matrix. A, G, and B^T can be determined from the preset Winograd transform parameters — the dimensions of the output neuron matrix M_out (a W_out×H_out matrix) and of the original weight matrix N (an F×F matrix) — i.e., from F(W_out×H_out, F×F), and the original weight transformation matrix G and the original input neuron transformation matrix B^T have the same number of rows.
It is judged whether data overflow occurs while the quantized input neuron matrix is being Winograd-transformed according to the original input neuron transformation matrix. If overflow occurs, the element values of the i-th row — the row corresponding to the overflowing data — of the original input neuron transformation matrix are each multiplied by a preset ratio value greater than 0, generally in the range (0, 1). Meanwhile, since the output neuron matrix of the convolutional neural network is the result of element-wise multiplication of the target weight matrix and the target input neuron matrix, the element values of the corresponding i-th row of the original weight transformation matrix are each divided by the preset ratio value so that all element values of the output neuron matrix remain unchanged; this yields the target input neuron transformation matrix and the target weight transformation matrix. After this processing, the element values of the target input neuron matrix obtained by Winograd-transforming the quantized input neuron matrix according to the target input neuron transformation matrix lie within, and do not exceed, the range [0, 127].
If no data overflow occurs during the Winograd transform of the quantized input neuron matrix according to the original input neuron transformation matrix, the elements of the original input neuron transformation matrix need not be adjusted, and the original input neuron transformation matrix is itself the target input neuron transformation matrix.
When adjusting the elements of the original input neuron transformation matrix and the original weight transformation matrix as above, the shift operation can be implemented directly with the single SIMD instruction vrhadd, which incurs no additional instruction overhead.
In summary, when data overflow occurs during the Winograd transform of the quantized input neuron matrix according to the original input neuron transformation matrix, multiplying the element values of the i-th row — the row corresponding to the overflowing data — of the original input neuron transformation matrix by a preset ratio value smaller than 1, and dividing the element values of the i-th row of the original weight transformation matrix by the same preset ratio value (i being a positive integer not greater than n, the number of rows of the original input neuron transformation matrix), keeps the element values of the target input neuron matrix within the range [0, 127] while leaving the element values of the output neuron matrix of the convolutional neural network unchanged, thereby solving the overflow problem that arises when quantizing the original input neuron matrix.
To aid understanding of the technical solution provided by the embodiments of the present application, F(2×2, 3×3) is taken as an example. From F(2×2, 3×3), the original input neuron transformation matrix (shown as an image in the original document; the standard Winograd matrix) is
B^T = [[1, 0, −1, 0], [0, 1, 1, 0], [0, −1, 1, 0], [0, 1, 0, −1]]
and the original weight transformation matrix is
G = [[1, 0, 0], [1/2, 1/2, 1/2], [1/2, −1/2, 1/2], [0, 0, 1]]
Suppose the quantized input neuron matrix (shown as an image in the original document) contains the elements a, b, c, d, e.g., the column vector (a, b, c, d)^T in the one-dimensional case.
When B^T is multiplied with the quantized input neuron matrix, the product of row 2 of B^T with the quantized input neuron matrix is b + c. Since b and c are both numbers in [0, 127], b + c may exceed 127, i.e., overflow may occur. In one embodiment, once overflow occurs, the preset ratio value is set to 1/2.
The element values of row 2 of the original input neuron transformation matrix B^T are each multiplied by the preset ratio value while the element values of the other rows of B^T remain unchanged, giving the target input neuron transformation matrix
[[1, 0, −1, 0], [0, 1/2, 1/2, 0], [0, −1, 1, 0], [0, 1, 0, −1]]
and the element values of row 2 of the original weight transformation matrix G are each divided by the preset ratio value 1/2 while the element values of the other rows of G remain unchanged, giving the target weight transformation matrix
[[1, 0, 0], [1, 1, 1], [1/2, −1/2, 1/2], [0, 0, 1]]
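The invariance behind this adjustment — halving row 2 of B^T while doubling row 2 of G leaves the element-wise product, and hence the output, unchanged — can be checked numerically in the one-dimensional case. The matrices below are the standard Winograd F(2, 3) matrices; this is an illustrative sketch, not code from the patent:

```python
BT = [[1, 0, -1, 0],   # original input neuron transformation matrix B^T
      [0, 1, 1, 0],    # row 2 computes b + c, which can exceed 127
      [0, -1, 1, 0],
      [0, 1, 0, -1]]
G = [[1.0, 0.0, 0.0],  # original weight transformation matrix G (1-D case)
     [0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5],
     [0.0, 0.0, 1.0]]

def matvec(m, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def scale_row(m, i, s):
    # multiply every element of row i by s, leaving the other rows unchanged
    return [[x * s for x in row] if r == i else list(row) for r, row in enumerate(m)]

d, g = [10.0, 120.0, 120.0, 5.0], [0.2, 0.4, 0.2]
base = [u * w for u, w in zip(matvec(BT, d), matvec(G, g))]
scaled = [u * w for u, w in zip(matvec(scale_row(BT, 1, 0.5), d),
                                matvec(scale_row(G, 1, 2.0), g))]
print(matvec(BT, d)[1])                     # 240.0: exceeds [0, 127]
print(matvec(scale_row(BT, 1, 0.5), d)[1])  # 120.0: back inside [0, 127]
```

`base` and `scaled` are element-wise equal, so the output neurons computed from them are identical while the transformed input stays within the 8-bit range.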
On the basis of the above technical solution, quantizing the original input neuron matrix to obtain a quantized input neuron matrix may include: adding a preset adjustment value to each of the element values of the original input neuron matrix to obtain an adjusted original input neuron matrix, and truncating the adjusted original input neuron matrix toward zero to obtain the quantized input neuron matrix.
In the embodiments of the present application, to reduce the quantization error as much as possible, a preset adjustment value — which may be the floating-point number 0.5 — can be added to each of the element values of the original input neuron matrix to obtain an adjusted original input neuron matrix, which is then quantized by truncation toward zero to obtain the quantized input neuron matrix.
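The rounding step — add 0.5, then truncate toward zero — can be sketched as below. Since ReLU activations are non-negative, this approximates round-to-nearest; the function name is illustrative, not from the patent:

```python
def quantize_round(values, adjust=0.5):
    # add the preset adjustment value, then truncate toward zero;
    # Python's int() truncates toward zero
    return [int(v + adjust) for v in values]

print(quantize_round([0.4, 0.6, 3.0, 126.7]))  # [0, 1, 3, 127]
```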
On the basis of the above technical solution, the element values of the target weight matrix may all lie in the range [−127, 127].
In an example of the present application, since quantization maps the elements of the target weight matrix from 32 bits to 8 bits, the element values of the target weight matrix may all lie in the range [−127, 127].
On the basis of the above technical solution, the element values of the target input neuron matrix may all lie in the range [0, 127].
In an example of the present application, since the original input neuron matrix is usually a matrix processed by an activation layer whose activation function is the ReLU function f(x) = max(0, x), where f(x) can represent the element values of the original input neuron matrix, quantizing the original input neuron matrix — i.e., quantizing its elements from 32 bits to 8 bits — yields a quantized input neuron matrix whose elements lie in [0, 127]; on this basis, the element values of the target input neuron matrix obtained by Winograd-transforming the quantized input neuron matrix may all lie in [0, 127].
In the technical solution provided by the embodiments of the present application, the Winograd transform reduces the time complexity of the convolutional neural network and quantization reduces its data bit width. Based on the distributions of the elements of the original weight matrix and the original input neuron matrix, the Winograd transform is performed first and quantization second for the original weight matrix, while quantization is performed first and the Winograd transform second for the original input neuron matrix. This exploits the advantages of the Winograd transform and of the quantization method to the greatest extent and, on the basis of ensuring that the error introduced by the above processing stays within the preset error range, greatly increases the computation speed of the convolutional neural network, essentially enabling the convolutional neural network to meet real-time operation requirements on mobile devices.
FIG. 2 is a flowchart of another convolutional neural network processing method provided by an embodiment of the present application. This embodiment is applicable to the case of increasing the computation speed of a convolutional neural network. The method may be executed by a convolutional neural network processing apparatus, which may be implemented in software and/or hardware and configured in a device, for example a computer or a mobile terminal. As shown in FIG. 2, the method includes the following steps:
Step 2010: obtain an original weight matrix and an original input neuron matrix of a convolutional neural network.
Step 2020: determine an original weight transformation matrix and an original input neuron transformation matrix according to preset Winograd transform parameters, where the two matrices have the same number of rows.
Step 2030: judge whether data overflow occurs while the quantized input neuron matrix is being Winograd-transformed according to the original input neuron transformation matrix; if overflow occurs, perform step 2040; if not, perform step 2050. Here i is a positive integer less than or equal to n, n being the number of rows of the original input neuron transformation matrix.
Step 2040: multiply the element values of the i-th row of the original input neuron transformation matrix by a preset ratio value to obtain new element values for that row, and divide the element values of the i-th row of the original weight transformation matrix by the preset ratio value to obtain new element values for that row, where the preset ratio value is less than 1.
Step 2050: keep the element values of the i-th row of the original input neuron transformation matrix unchanged, and keep the element values of the i-th row of the original weight transformation matrix unchanged.
Step 2060: generate a target input neuron transformation matrix and a target weight transformation matrix.
Step 2070: perform a Winograd transform on the original weight matrix according to the target weight transformation matrix to obtain a Winograd-transformed weight matrix.
Step 2080: quantize the Winograd-transformed weight matrix to obtain a target weight matrix.
Step 2090: quantize the original input neuron matrix to obtain a quantized input neuron matrix.
Step 2100: perform a Winograd transform on the quantized input neuron matrix according to the target input neuron transformation matrix to obtain a target input neuron matrix.
Step 2110: obtain an output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
In the embodiments of the present application, steps 2070-2080 and steps 2090-2100 may be performed synchronously; alternatively, steps 2070-2080 may be executed first and then steps 2090-2100, or steps 2090-2100 first and then steps 2070-2080. The execution order can be determined according to the actual situation and is not limited here.
In the technical solution of this embodiment, the original weight matrix and the original input neuron matrix of a convolutional neural network are obtained; the original weight matrix undergoes a Winograd transform and then quantization to obtain a target weight matrix, and the original input neuron matrix undergoes quantization and then a Winograd transform to obtain a target input neuron matrix; the output neuron matrix of the convolutional neural network is then obtained from the target weight matrix and the target input neuron matrix. The Winograd transform reduces the time complexity of the convolutional neural network and quantization reduces its data bit width, thereby increasing the computation speed of the convolutional neural network.
FIG. 3 is a schematic structural diagram of a convolutional neural network processing apparatus provided by an embodiment of the present application. This embodiment is applicable to the case of increasing the computation speed of a convolutional neural network. The apparatus may be implemented in software and/or hardware and configured in a device, for example a computer or a mobile terminal. As shown in FIG. 3, the apparatus includes:
an original matrix obtaining module 310, configured to obtain an original weight matrix and an original input neuron matrix of a convolutional neural network;
a target matrix generation module 320, configured to perform a Winograd transform and then quantization on the original weight matrix to obtain a target weight matrix, and to perform quantization and then a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix; and
an output neuron matrix generation module 330, configured to obtain an output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
In the technical solution of this embodiment, the original weight matrix and the original input neuron matrix of a convolutional neural network are obtained; the original weight matrix undergoes a Winograd transform and then quantization to obtain a target weight matrix, and the original input neuron matrix undergoes quantization and then a Winograd transform to obtain a target input neuron matrix; the output neuron matrix of the convolutional neural network is then obtained from the target weight matrix and the target input neuron matrix. The Winograd transform reduces the time complexity of the convolutional neural network and quantization reduces its data bit width, thereby increasing the computation speed of the convolutional neural network.
FIG. 4 is a schematic structural diagram of a device provided by an embodiment of the present application. FIG. 4 shows a block diagram of an exemplary device 412 suitable for implementing the present embodiment. As shown in FIG. 4, the device 412 takes the form of a general-purpose computing device. The components of the device 412 may include, but are not limited to, one or more processors 416, a system memory 428, and a bus 418 connecting different system components (including the system memory 428 and the processors 416). The system memory 428 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 430 and/or a cache memory 432. Merely as an example, a storage system 434 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard drive"). A program/utility 440 having a set (at least one) of program modules 442 may be stored, for example, in the memory 428. The device 412 may also communicate with one or more external devices 414 (e.g., a keyboard, a pointing device, a display 424, etc.); this communication can take place through input/output (I/O) interfaces 422. Moreover, the device 412 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 420. By running the programs stored in the system memory 428, the processor 416 executes various functional applications and data processing, for example implementing a convolutional neural network processing method provided by the embodiments of the present application, including: obtaining an original weight matrix and an original input neuron matrix of a convolutional neural network; performing a Winograd transform and then quantization on the original weight matrix to obtain a target weight matrix, and performing quantization and then a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix; and obtaining an output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements a convolutional neural network processing method as provided by the embodiments of the present application, the method including: obtaining an original weight matrix and an original input neuron matrix of a convolutional neural network; performing a Winograd transform and then quantization on the original weight matrix to obtain a target weight matrix, and performing quantization and then a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix; and obtaining an output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.

Claims (10)

  1. A convolutional neural network processing method, comprising:
    obtaining an original weight matrix and an original input neuron matrix of a convolutional neural network;
    performing a Winograd transform and then quantization on the original weight matrix to obtain a target weight matrix, and performing quantization and then a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix; and
    obtaining an output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
  2. The method according to claim 1, wherein performing a Winograd transform and then quantization on the original weight matrix to obtain a target weight matrix comprises:
    obtaining a target weight transformation matrix;
    performing a Winograd transform on the original weight matrix according to the target weight transformation matrix to obtain a Winograd-transformed weight matrix; and
    quantizing the Winograd-transformed weight matrix to obtain the target weight matrix.
  3. The method according to claim 2, wherein performing quantization and then a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix comprises:
    obtaining a target input neuron transformation matrix;
    quantizing the original input neuron matrix to obtain a quantized input neuron matrix; and
    performing a Winograd transform on the quantized input neuron matrix according to the target input neuron transformation matrix to obtain the target input neuron matrix.
  4. The method according to claim 3, wherein the target weight transformation matrix and the target input neuron transformation matrix are determined as follows:
    determining an original weight transformation matrix and an original input neuron transformation matrix according to preset Winograd transform parameters, the original weight transformation matrix and the original input neuron transformation matrix having the same number of rows; and
    in the case that data overflow occurs while the quantized input neuron matrix is being Winograd-transformed according to the original input neuron transformation matrix, multiplying the element values of the i-th row, corresponding to the overflowing data, of the original input neuron transformation matrix by a preset ratio value to obtain the target input neuron transformation matrix, and dividing the element values of the i-th row of the original weight transformation matrix by the preset ratio value to obtain the target weight transformation matrix, wherein the preset ratio value is less than 1, i is a positive integer less than or equal to n, and n is the number of rows of the original input neuron transformation matrix.
  5. The method according to claim 3 or 4, wherein quantizing the original input neuron matrix to obtain a quantized input neuron matrix comprises:
    adding a preset adjustment value to each of the element values of the original input neuron matrix to obtain an adjusted original input neuron matrix; and
    truncating the adjusted original input neuron matrix toward zero to obtain the quantized input neuron matrix.
  6. The method according to any one of claims 1-5, wherein the element values of the target weight matrix all lie in the range [-127, 127].
  7. The method according to any one of claims 1-6, wherein the element values of the target input neuron matrix all lie in the range [0, 127].
  8. A convolutional neural network processing apparatus, comprising:
    an original matrix obtaining module, configured to obtain an original weight matrix and an original input neuron matrix of a convolutional neural network;
    a target matrix generation module, configured to perform a Winograd transform and then quantization on the original weight matrix to obtain a target weight matrix, and to perform quantization and then a Winograd transform on the original input neuron matrix to obtain a target input neuron matrix; and
    an output neuron matrix generation module, configured to obtain an output neuron matrix of the convolutional neural network according to the target weight matrix and the target input neuron matrix.
  9. A device, comprising:
    at least one processor; and
    a memory configured to store at least one program;
    wherein, when executed by the at least one processor, the at least one program causes the at least one processor to implement the method according to any one of claims 1-7.
  10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1-7.
PCT/CN2019/125047 2018-12-28 2019-12-13 Convolutional neural network processing method, apparatus, device, and storage medium WO2020135093A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811627040.3A CN111382854B (zh) 2018-12-28 2018-12-28 Convolutional neural network processing method, apparatus, device, and storage medium
CN201811627040.3 2018-12-28

Publications (1)

Publication Number Publication Date
WO2020135093A1 true WO2020135093A1 (zh) 2020-07-02

Family

ID=71127510

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/125047 WO2020135093A1 (zh) 2018-12-28 2019-12-13 Convolutional neural network processing method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN111382854B (zh)
WO (1) WO2020135093A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022227024A1 (zh) * 2021-04-30 2022-11-03 Huawei Technologies Co., Ltd. Operation method, training method, and apparatus for a neural network model

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229654A (zh) * 2016-12-14 2018-06-29 Shanghai Cambricon Information Technology Co., Ltd. Neural network convolution operation device and method
CN108229656A (zh) * 2016-12-14 2018-06-29 Shanghai Cambricon Information Technology Co., Ltd. Neural network operation device and method
US20180189237A1 (en) * 2016-12-30 2018-07-05 Intel Corporation Winograd algorithm on a matrix processing architecture
CN108765247A (zh) * 2018-05-15 2018-11-06 Tencent Technology (Shenzhen) Co., Ltd. Image processing method, apparatus, storage medium, and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107992940A (zh) * 2017-12-12 2018-05-04 Zhengzhou Yunhai Information Technology Co., Ltd. Method and device for implementing a convolutional neural network on an FPGA

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229654A (zh) * 2016-12-14 2018-06-29 Shanghai Cambricon Information Technology Co., Ltd. Neural network convolution operation device and method
CN108229656A (zh) * 2016-12-14 2018-06-29 Shanghai Cambricon Information Technology Co., Ltd. Neural network operation device and method
US20180189237A1 (en) * 2016-12-30 2018-07-05 Intel Corporation Winograd algorithm on a matrix processing architecture
CN108765247A (zh) * 2018-05-15 2018-11-06 Tencent Technology (Shenzhen) Co., Ltd. Image processing method, apparatus, storage medium, and device

Also Published As

Publication number Publication date
CN111382854B (zh) 2021-03-23
CN111382854A (zh) 2020-07-07

Similar Documents

Publication Publication Date Title
US20210166113A1 (en) Method for neural network and apparatus performing same method
WO2021036905A1 (zh) 数据处理方法、装置、计算机设备和存储介质
WO2017185412A1 (zh) 一种支持较少位数定点数的神经网络运算的装置和方法
KR102399548B1 (ko) 뉴럴 네트워크를 위한 방법 및 그 방법을 수행하는 장치
WO2017185414A1 (zh) 一种支持较少位数浮点数的神经网络运算的装置和方法
CN113011571B (zh) 基于Transformer模型的INT8离线量化及整数推断方法
WO2022111002A1 (zh) 用于训练神经网络的方法、设备和计算机可读存储介质
US11842250B2 (en) Quantum error correction decoding system and method, fault-tolerant quantum error correction system, and chip
WO2023231794A1 (zh) 一种神经网络参数量化方法和装置
WO2023020456A1 (zh) 网络模型的量化方法、装置、设备和存储介质
WO2020249085A1 (zh) 基于神经网络计算的数据处理方法和装置
WO2020135093A1 (zh) 卷积神经网络处理方法、装置、设备及存储介质
WO2021081854A1 (zh) 一种卷积运算电路和卷积运算方法
CN112926646B (zh) 数据批量标准化方法、计算设备和计算机可读存储介质
Wu et al. Accuracy tolerant neural networks under aggressive power optimization
CN113313253A (zh) 神经网络压缩方法、数据处理方法、装置及计算机设备
US11263517B1 (en) Flexible weight expansion
WO2024060727A1 (zh) 神经网络模型的训练方法、装置、设备及系统
CN117911794B (zh) 用于图像分类的模型获得方法、装置、电子设备和存储介质
TWI795135B (zh) 神經網路模型的量化方法及深度學習加速器
WO2022257920A1 (zh) 优化深度神经网络的参数的处理系统、集成电路及板卡
US20240134930A1 (en) Method and apparatus for neural network weight block compression in a compute accelerator
WO2021036412A1 (zh) 数据处理方法、装置、计算机设备和存储介质
Furuta et al. An Efficient Implementation of FPGA-based Object Detection Using Multi-scale Attention
Gensheng et al. Multi-output support vector machine regression and its online learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19905484

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19905484

Country of ref document: EP

Kind code of ref document: A1