CN111967574A - Convolutional neural network training method based on tensor singular value delimitation - Google Patents
- Publication number
- CN111967574A (application CN202010700940.7A)
- Authority
- CN
- China
- Prior art keywords
- tensor
- weight
- singular value
- layer
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
Abstract
The invention discloses a convolutional neural network training method based on tensor singular value delimitation, comprising the following steps: S1, initializing the weights of the convolutional neural network so that the weight matrices of the fully-connected layers and convolutional layers are orthogonal and the singular values of each convolutional layer's weight tensor are equal; S2, training the convolutional neural network using stochastic gradient descent or one of its variants; S3, after each training iteration, alternately applying matrix singular value delimitation and tensor singular value delimitation updates to the convolutional layer weights, applying matrix singular value delimitation updates to the fully-connected layer weights, and applying delimitation updates to the batch normalization layer weights. The invention adds an orthogonality constraint to the weight tensor that preserves the network's energy without destroying the tensor structure of the weights. To enforce the orthogonal-tensor constraint, the invention thresholds the singular values of the weight tensor, realizing the training of an orthogonal tensor network and improving the performance of image classification networks.
Description
Technical Field
The invention belongs to the field of artificial intelligence, relates to machine learning and deep learning, and particularly relates to a convolutional neural network training method based on tensor singular value delimitation.
Background
Deep convolutional neural networks have enjoyed great success in many applications, such as image classification and object detection. Their success stems primarily from their powerful capacity to represent complex relationships from input to output. But this strong expressive capacity also increases the risk of overfitting. To mitigate overfitting, researchers have introduced many techniques such as weight decay, dropout, and label perturbation. The deep cascaded structure of convolutional neural networks also brings problems such as vanishing/exploding gradients and proliferating saddle points, making training difficult. To address these problems, methods such as careful parameter initialization, skip connections (shortcuts), and batch normalization (BN) have been proposed to simplify the optimization of convolutional neural networks.
Orthogonality is also used to address the overfitting and optimization problems of deep convolutional neural networks. Theoretically, it has been proved that when the singular values of the weight matrices are equal, a convolutional neural network achieves the optimal generalization error, reducing the risk of overfitting. Orthogonality also limits the magnitude of the gradient and stabilizes the distribution of each layer's activation outputs, making optimization more efficient. Many convolutional neural network methods that impose orthogonality constraints have been proposed. Soft orthogonality regularization constrains the Gram matrix of the weight matrix to be near the identity matrix under the Frobenius norm. By analyzing the restricted isometry property, the Frobenius norm can be replaced by the spectral norm, improving performance. Since orthogonal matrices lie on the Stiefel manifold, projected gradient descent has also become a method for solving deep learning optimization problems with strict orthogonality constraints. Another line of work adds, at the level of network architecture, a module that converts an ordinary weight matrix into an orthogonal matrix, which can then be optimized by ordinary gradient descent. By relaxing the hard orthogonality constraint, the singular value bounding method limits all singular values of each weight matrix to a threshold range near 1 after each training epoch (the number of iterations required to traverse all the training data once), realizing a fast approximate solution of the orthogonality constraint.
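As a concrete illustration of the soft orthogonality regularizer mentioned above, the following minimal numpy sketch computes the penalty ‖WᵀW − I‖_F²; the function name, shapes, and values are illustrative, not taken from the patent.

```python
import numpy as np

def soft_orthogonality_penalty(W: np.ndarray) -> float:
    """Soft orthogonality regularizer ||W^T W - I||_F^2.

    W is a (rows x cols) weight matrix with rows >= cols; for a
    convolutional layer W would be the kernel tensor unfolded to 2-D.
    """
    gram = W.T @ W                     # Gram matrix of the columns
    eye = np.eye(W.shape[1])
    return float(np.sum((gram - eye) ** 2))

# The penalty is ~0 for an orthogonal matrix, large otherwise.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((64, 16)))
print(soft_orthogonality_penalty(Q))                              # ~ 0
print(soft_orthogonality_penalty(rng.standard_normal((64, 16))))  # large
```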
Orthogonality constraints have been successfully applied to convolutional neural network training: adding an orthogonality constraint can increase the stability of convergence during training; an orthogonal regularization loss function can be used during training; singular value bounding can be performed after unfolding the weights into a two-dimensional matrix; or the weights can be unfolded into a two-dimensional matrix and multiplied by its pseudo-inverse, forming a new orthogonality constraint on the convolution kernel weights. However, in these methods the weight tensor of a convolutional layer is unfolded into a matrix when the constraint is applied, destroying the structural characteristics of the tensor. The tensor-tensor product is a recently defined tensor operation from which a series of matrix-like properties can be derived; it has attracted attention in the machine learning field in recent years and has been successfully applied to preserving the structural characteristics of tensors. Robust principal component analysis has been generalized to tensors by combining it with tensor singular value decomposition (t-SVD), with good results in image and video processing. The present invention uses the tensor singular value decomposition derived from the tensor-tensor product to generalize the orthogonal matrix constraint to tensors, realizing a convolutional neural network training method based on tensor structure constraints.
Disclosure of Invention
The invention provides a convolutional neural network training method based on tensor singular value delimitation, which adds structural constraints on the matrices and tensors of a convolutional neural network to improve network performance: it integrates orthogonality constraints while preserving the tensor structure of the network weights, and stably improves the performance of the convolutional neural network.
In the design of the objective function's constraints, the invention adds the constraint that the tensor singular values be equal on top of the constraint that the weight matrices be orthogonal, theoretically guaranteeing that the weight tensor obtained by solving the network is an orthogonal tensor or the product of an orthogonal tensor and a constant. During training, after every few optimization iterations, the matrix singular values and tensor singular values of the weights are each limited to a threshold range, so that the solved weights approximately satisfy the constraints of the objective function.
The invention is realized by at least one of the following technical schemes.
A convolutional neural network training method based on tensor singular value delimitation comprises the following steps:
S1, initializing the weights of the convolutional neural network so that the weight matrices of the fully-connected layers and convolutional layers are orthogonal and the singular values of the convolutional layer weight tensors are equal;
S2, training the initialized convolutional neural network using stochastic gradient descent (SGD) or one of its variants (including SGD with Momentum, SGD with Nesterov Momentum, AdaGrad, Adadelta, RMSprop and Adam);
S3, after each training iteration, alternately applying matrix singular value delimitation and tensor singular value delimitation updates to the convolutional layer weights of the convolutional neural network, applying matrix singular value delimitation updates to the fully-connected layer weights, and applying delimitation updates to the weights of the batch normalization (BN) layers. If the loss function has converged, training is finished; if not, the process returns to step S2 (a minimal skeleton of this loop is sketched below).
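The overall S1–S3 loop can be pictured with the following minimal PyTorch skeleton. It is an editorial sketch, not the patented implementation: it shows only the matrix singular value bounding part of S3, with a stand-in batch in place of a real data loader; the tensor and BN delimitation updates described below would be applied at the same point in the loop.

```python
import numpy as np
import torch
import torch.nn as nn

def bound_matrix_singular_values(W: np.ndarray, eps1: float = 0.2) -> np.ndarray:
    """Clip all singular values of a 2-D weight into [1/(1+eps1), 1+eps1]."""
    U, s, Vh = np.linalg.svd(W, full_matrices=False)
    return (U * np.clip(s, 1 / (1 + eps1), 1 + eps1)) @ Vh

model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 3, 16, 16)           # stand-in batch; a real loader goes here
y = torch.randint(0, 10, (32,))

for epoch in range(3):                   # S2: ordinary SGD training
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
    with torch.no_grad():                # S3: bound the weights after the epoch
        for m in model.modules():
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                W = m.weight.detach().cpu().numpy()
                W2 = bound_matrix_singular_values(W.reshape(W.shape[0], -1))
                m.weight.copy_(torch.from_numpy(W2.reshape(W.shape)).float())
```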
Further, step S1 constrains the initialization of the convolutional neural network weights to satisfy matrix orthogonality and equal tensor singular values.
Further, after the fully-connected layer weights of the convolutional neural network are randomly initialized in step S1, all singular values of the fully-connected layer weight matrix are limited to 1, so that the fully-connected layer weight matrix is orthogonal.
Further, after the convolutional layer weights of the convolutional neural network are randomly initialized in step S1, the following operations are performed alternately until convergence:
1) limiting all singular values of the convolutional layer weight matrix to 1, so that the convolutional layer weight matrix is orthogonal;
2) on the premise of keeping the Frobenius norm (hereinafter, F-norm) unchanged, constraining all singular values of the convolutional layer weight tensor to be equal, so that the convolutional layer weight tensor is an orthogonal tensor or the product of an orthogonal tensor and a constant.
Further, in step S2, after several training iterations of the initialized convolutional neural network using stochastic gradient descent or one of its variants, the weights of the convolutional neural network are updated using threshold delimitation.
Further, the step S3 of performing matrix singular value delimitation on the weight matrices of the convolutional layer and the fully-connected layer includes the following steps:
a) performing matrix singular value decomposition on the weight matrix;
b) thresholding each singular value of the weight matrix so that it is close to 1;
c) reconstructing the weight matrix from the updated singular values.
Further, the tensor singular value delimitation of step S3 includes the following steps (a worked example of step a) follows this list):
a) keeping the F-norm of the tensor equal to the F-norm of the corresponding orthogonal matrix, computing the expected singular value obtained when all tensor singular values are equal;
b) performing tensor singular value decomposition on the weight tensor;
c) thresholding each singular value of the weight tensor so that it is close to the computed expected singular value;
d) reconstructing the weight tensor from the updated singular values.
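For illustration, under one consistent reading of step a) — assuming the mode-1 unfolding W is orthogonal with K ≤ Cd², and the unnormalized FFT convention — the expected singular value can be computed in closed form. This derivation is an editorial sketch, not text from the patent:

$$\|\mathcal{W}\|_F^2=\|W\|_F^2=K,\qquad \|\widehat{\mathcal{W}}\|_F^2=d^2\,\|\mathcal{W}\|_F^2=d^2K.$$

If all $d^2\min(K,C)$ Fourier-domain slice singular values share one common value $s$, then $d^2\min(K,C)\,s^2=d^2K$, giving

$$s=\sqrt{K/\min(K,C)},$$

and since each tensor singular value is the mean of the corresponding slice singular values, the expected tensor singular value equals $s$ (in particular, $s=1$ when $K\le C$).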
Further, the threshold delimitation of the BN layer weights in step S3 includes the following steps:
(1) computing the mean of the quotient of each neuron weight and the input standard deviation;
(2) limiting the quotient of each neuron weight and the input standard deviation to be close to the corresponding mean, obtaining the new BN layer weights.
Compared with the prior art, the invention has the following beneficial effects:
The weight tensors of the convolutional neural network are constrained, preserving the structural information of the weight tensors in the model structure; compared with imposing only matrix orthogonality constraints on the weight matrices of the convolutional neural network, this reduces the solution space of the network weights and simplifies optimization. The invention effectively improves the performance of the convolutional neural network.
Drawings
Fig. 1 is a flowchart of the convolutional neural network training method based on tensor singular value delimitation of this embodiment;
Fig. 2 is a schematic diagram of matrix singular value delimitation in this embodiment;
Fig. 3 is a schematic diagram of tensor singular value delimitation in this embodiment.
Detailed Description
The present invention will be described in further detail below with reference to specific embodiments, but the embodiments of the present invention are not limited thereto.
The principle of the invention is as follows: on the basis of orthogonality constraints on the weight matrices of the convolutional and fully-connected layers of the convolutional neural network, the singular values of the convolutional layer weight tensors are further constrained to be equal, preserving the structural characteristics of the weight tensors. Thresholding the singular values of the matrices and tensors approximately satisfies these constraints, yielding a convolutional neural network training method that improves network performance.
As shown in fig. 1, a convolutional neural network training method based on tensor singular value delimitation includes the following steps:
S1, initializing the weights of the convolutional neural network so that the weight matrices of the fully-connected layers and convolutional layers are orthogonal and the singular values of the convolutional layer weight tensors are equal;
specifically, the weights of the convolutional layer and the fully-connected layer are initialized to random values, and then all singular values of the weight matrices of the convolutional layer and the fully-connected layer are set to be 1, so that the weight matrices are orthogonal matrices. In the convolutional layer, on the premise of keeping the F-norm constant, all singular values of the convolutional layer weight tensor are further made equal so that the convolutional layer weight tensor is the orthogonal tensor or the product of the orthogonal tensor and the constant, and the singular value setting of the weight matrix and the weight tensor of the convolutional layer is alternately performed until convergence.
S2, the initialized convolutional neural network is trained using stochastic gradient descent or one of its variants (including SGD with Momentum, SGD with Nesterov Momentum, AdaGrad, Adadelta, RMSprop and Adam). After each training epoch, the weights are updated according to step S3.
S3, the convolutional layer weights are updated using matrix singular value delimitation and tensor singular value delimitation.
For convenience of description, the notation is fixed as follows. For any convolutional layer, the convolution weight tensor is 𝒲 ∈ R^(K×C×d²), a three-dimensional real tensor whose three dimensions have sizes K, C and d², where R denotes the real numbers, C the number of input channels, K the number of convolution kernels, and d the size of the convolution kernels. The mode-1 unfolding of the convolution weight tensor 𝒲 yields the corresponding weight matrix W ∈ R^(K×Cd²), where K and Cd² are the sizes of the two dimensions of the matrix. For a fully-connected layer, the weights already form a matrix. For uniformity of presentation, the invention writes both the convolutional layer weight matrix and the fully-connected layer weight matrix as W ∈ R^(K×m); when W denotes a convolution weight matrix, m = Cd².
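As a small illustration of the notation, the mode-1 unfolding can be realized with a reshape (index-ordering conventions for unfoldings vary; this is one concrete choice, with illustrative shapes):

```python
import numpy as np

K, C, d = 8, 4, 3                          # kernels, input channels, kernel size
W_tensor = np.random.randn(K, C, d * d)    # weight tensor in R^(K x C x d^2)

# Mode-1 unfolding: keep the first dimension, flatten the rest -> K x (C d^2)
W_matrix = W_tensor.reshape(K, C * d * d)

# Folding back recovers the tensor exactly
assert np.array_equal(W_matrix.reshape(K, C, d * d), W_tensor)
```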
Step S3 is specifically as follows:
(1) The convolutional layer weights are updated according to the matrix singular values, as shown in Fig. 2, with the following steps:
First, singular value decomposition (SVD) is performed on the convolutional layer weight matrix W, giving W = UΣVᵀ, where U is a K×m matrix with orthonormal columns, Σ is an m×m non-negative real diagonal matrix whose diagonal elements are the singular values of W, V is an m×m unitary matrix, and Vᵀ denotes the transpose of V.
Second, each diagonal element of the singular value matrix Σ is thresholded as follows:
if σᵢ > 1+ε₁, then σᵢ = 1+ε₁;
if σᵢ < 1/(1+ε₁), then σᵢ = 1/(1+ε₁);
if 1/(1+ε₁) ≤ σᵢ ≤ 1+ε₁, then σᵢ is kept unchanged;
where σᵢ denotes the i-th diagonal element of Σ, and ε₁ is a small value ranging from 0.1 to 0.5, so that each singular value σᵢ is bounded and constrained near 1.
Third, UΣVᵀ is computed from the new Σ to obtain the new convolutional layer weight matrix W.
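A compact numpy sketch of the three steps above (the same helper as in the loop skeleton earlier, now with a check that the bounds hold); `eps1` plays the role of ε₁, and the default value is illustrative:

```python
import numpy as np

def bound_matrix_singular_values(W: np.ndarray, eps1: float = 0.2) -> np.ndarray:
    """Steps 1-3: SVD, clip every singular value into [1/(1+eps1), 1+eps1],
    then rebuild the weight matrix."""
    U, s, Vh = np.linalg.svd(W, full_matrices=False)
    s = np.clip(s, 1.0 / (1.0 + eps1), 1.0 + eps1)
    return (U * s) @ Vh                  # equals U @ np.diag(s) @ Vh

# Check: all singular values of the result lie in [1/1.2, 1.2].
print(np.linalg.svd(bound_matrix_singular_values(np.random.randn(8, 36)),
                    compute_uv=False))
```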
(2) The convolutional layer weights are updated according to the tensor singular values. The convolution weight tensor 𝒲 is decomposed with the tensor singular value decomposition (t-SVD) as 𝒲 = 𝒰 * 𝒮 * 𝒱ᵀ, and the tensor singular values of 𝒮 are bounded, as shown in Fig. 3. Here 𝒰 and 𝒱 are orthogonal tensors of sizes K×K×d² and C×C×d² respectively; 𝒮 is a frontal-diagonal tensor (i.e., a tensor all of whose frontal slices are diagonal matrices) of size K×C×d², whose first frontal slice carries the tensor singular values of 𝒲 on its diagonal; and * is the tensor-tensor product. In practice the tensor singular value decomposition is obtained via the Fourier transform and matrix singular value decompositions, computed as follows: first, a fast Fourier transform is applied to the convolution weight tensor 𝒲 along the convolution kernel dimension, giving the transformed tensor Ŵ, written Ŵ = fft(𝒲, 3); second, a matrix singular value decomposition is applied to each frontal slice of Ŵ, i.e. Ŵ(i) = Û(i)Σ̂(i)(V̂(i))ᵀ, where Ŵ(i) denotes the i-th frontal slice matrix of Ŵ, Û(i) is a K×K unitary matrix, Σ̂(i) is a K×C non-negative real diagonal matrix whose diagonal elements are the singular values of Ŵ(i), and V̂(i) is a C×C unitary matrix with transpose (V̂(i))ᵀ; finally, inverse fast Fourier transforms along the convolution kernel dimension are applied to the frontal-slice tensors Û, Σ̂ and V̂, giving 𝒰 = ifft(Û, 3), 𝒮 = ifft(Σ̂, 3) and 𝒱 = ifft(V̂, 3). From the nature of the inverse Fourier transform, each element of the first frontal slice of 𝒮 equals the mean of the elements at the corresponding positions of all frontal slices of Σ̂; hence the tensor singular values of 𝒲 can be bounded by bounding the singular values of the slices Σ̂(i). This comprises the following steps:
First, the fast Fourier transform of the convolution weight tensor 𝒲 along the convolution kernel dimension is computed, giving Ŵ = fft(𝒲, 3).
Second, a matrix singular value decomposition Ŵ(i) = Û(i)Σ̂(i)(V̂(i))ᵀ is performed for each frontal slice.
Third, for each i from 1 to d², each diagonal element σ̂ⱼ(i) of Σ̂(i) is thresholded as follows:
if σ̂ⱼ(i) > 1+ε₂, then σ̂ⱼ(i) = 1+ε₂;
if σ̂ⱼ(i) < 1/(1+ε₂), then σ̂ⱼ(i) = 1/(1+ε₂);
otherwise σ̂ⱼ(i) is kept unchanged;
where σ̂ⱼ(i) denotes the j-th diagonal element of Σ̂(i), and ε₂ is a small value ranging from 0.1 to 0.5, so that the singular values σ̂ⱼ(i) are bounded and constrained near 1; each new slice is then rebuilt as Ŵ(i) = Û(i)Σ̂(i)(V̂(i))ᵀ.
Fourth, an inverse fast Fourier transform along the convolution kernel dimension is applied to the new Ŵ, giving the new convolution weight tensor 𝒲.
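The Fourier-domain procedure above can be sketched in numpy as follows; `eps2` plays the role of ε₂, and the shapes and default values are illustrative assumptions:

```python
import numpy as np

def bound_tensor_singular_values(W: np.ndarray, K: int, C: int, d: int,
                                 eps2: float = 0.2) -> np.ndarray:
    """FFT along the kernel dimension, clip the singular values of every
    frontal slice into [1/(1+eps2), 1+eps2], then transform back."""
    That = np.fft.fft(W.reshape(K, C, d * d), axis=2)   # step 1: FFT
    out = np.empty_like(That)
    for i in range(d * d):                              # steps 2-3: SVD + clip
        U, s, Vh = np.linalg.svd(That[:, :, i], full_matrices=False)
        s = np.clip(s, 1.0 / (1.0 + eps2), 1.0 + eps2)
        out[:, :, i] = (U * s) @ Vh
    # Step 4: inverse FFT; the input is real, so the result is real up to rounding.
    return np.fft.ifft(out, axis=2).real.reshape(K, C * d * d)
```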
(3) Steps (1) and (2) are alternated for several iterations. This embodiment suggests 1.5 iterations (i.e., performing step (1), step (2), and step (1) again before exiting), which strikes a good balance between computation time and final effect; practical applications are not limited to 1.5 iterations.
S4, the fully-connected layer weights are updated using matrix singular value delimitation, with the following steps:
(1) SVD is performed on the fully-connected layer weight matrix W, giving W = UΣVᵀ.
(2) Each diagonal element σᵢ of Σ is thresholded as follows:
if σᵢ > 1+ε, then σᵢ = 1+ε;
if σᵢ < 1/(1+ε), then σᵢ = 1/(1+ε);
if 1/(1+ε) ≤ σᵢ ≤ 1+ε, then σᵢ remains unchanged;
where ε is a small value ranging from 0.1 to 0.5.
(3) UΣVᵀ is computed from the new Σ to obtain the new W.
S5, the BN layer weights are updated by delimitation. Suppose the input of the BN layer is h ∈ Rⁿ (i.e., h is an n-dimensional real vector, where R denotes the real numbers); the operation of the BN layer can then be expressed as:
BN(h) = ΥΦ⁻¹(h − μ) + β,
where n is the number of channels of the BN layer, numerically equal to the number of convolution kernels of the convolutional layer feeding the BN layer (i.e., n = K); μ is the mean of the batch of neuron inputs; Φ is a diagonal matrix whose diagonal elements φᵢ are the standard deviations of the batch of neuron inputs; Υ is a diagonal matrix whose diagonal elements are the learnable BN layer weights υᵢ; and β is a learnable BN layer bias term. The specific steps of the threshold delimitation update of the BN layer are:
(1) The mean α of the quotients υᵢ/φᵢ over all neurons is computed.
(2) Each quotient υᵢ/φᵢ is thresholded as follows:
if υᵢ/φᵢ > α(1+ε₃), then υᵢ = α(1+ε₃)φᵢ;
if υᵢ/φᵢ < α/(1+ε₃), then υᵢ = αφᵢ/(1+ε₃);
otherwise υᵢ is kept unchanged;
where υᵢ is the weight of the i-th neuron of the BN layer, φᵢ is the standard deviation of the batch of inputs to the i-th neuron, and ε₃ is a small value (ranging between 0.1 and 0.5), so that each quotient υᵢ/φᵢ is bounded and constrained near α.
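A minimal numpy sketch of this BN weight update, assuming positive quotients (as after the usual BN initialization υᵢ = 1); `eps3` plays the role of ε₃ and its default is illustrative:

```python
import numpy as np

def bound_bn_weights(gamma: np.ndarray, std: np.ndarray,
                     eps3: float = 0.2) -> np.ndarray:
    """Bound each quotient gamma_i / std_i within a threshold of their mean."""
    ratio = gamma / std                   # upsilon_i / phi_i per neuron
    alpha = ratio.mean()                  # step (1): mean quotient
    ratio = np.clip(ratio, alpha / (1.0 + eps3), alpha * (1.0 + eps3))
    return ratio * std                    # step (2): recover the new BN weights
```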
and S6, repeatedly executing the steps S3 to S5 until the training of the convolutional neural network is converged.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (8)
1. A convolutional neural network training method based on tensor singular value delimitation is characterized by comprising the following steps:
S1, initializing the weights of the convolutional neural network so that the weight matrices of the fully-connected layers and convolutional layers are orthogonal and the singular values of the convolutional layer weight tensors are equal;
S2, training the initialized convolutional neural network using stochastic gradient descent or one of its variants;
S3, after each training iteration, alternately applying matrix singular value delimitation and tensor singular value delimitation updates to the convolutional layer weights of the convolutional neural network, applying matrix singular value delimitation updates to the fully-connected layer weights, and applying delimitation updates to the weights of the batch normalization (BN) layers; if the loss function has converged, training is finished; if the loss function has not converged, the process returns to step S2.
2. The method of claim 1, wherein step S1 constrains the initialization of the convolutional neural network weights to satisfy matrix orthogonality and equal tensor singular values.
3. The method according to claim 2, wherein after the fully-connected layer weights of the convolutional neural network are randomly initialized in step S1, all singular values of the fully-connected layer weight matrix are limited to 1, so that the fully-connected layer weight matrix is orthogonal.
4. The method according to claim 3, wherein the random initialization of the convolutional layer weights of the convolutional neural network in step S1 is followed by alternating the following operations until convergence:
limiting all singular values of the convolutional layer weight matrix to 1, so that the convolutional layer weight matrix is orthogonal;
and, on the premise of keeping the Frobenius norm (F-norm) unchanged, constraining all singular values of the convolutional layer weight tensor to be equal, so that the convolutional layer weight tensor is an orthogonal tensor or the product of an orthogonal tensor and a constant.
5. The method according to claim 1, wherein step S2 performs several training iterations of the initialized convolutional neural network using stochastic gradient descent or one of its variants, after which the weights of the convolutional neural network are updated using threshold delimitation.
6. The method of claim 1, wherein the step S3 of performing matrix singular value delimitation on the weight matrices of the convolutional layer and the fully-connected layer comprises the steps of:
performing matrix singular value decomposition on the weight matrix;
performing threshold value constraint on each singular value of the weight matrix, so that each singular value is close to 1;
and reconstructing the weight matrix from the updated singular values.
7. The method of claim 1, wherein the tensor singular value delimitation of step S3 comprises the steps of:
keeping the F-norm of the tensor equal to the F-norm of the corresponding orthogonal matrix, and computing the expected singular value obtained when all tensor singular values are equal;
performing tensor singular value decomposition on the weight tensor;
thresholding each singular value of the weight tensor so that it is close to the computed expected singular value;
and reconstructing the weight tensor from the updated singular values.
8. The method of claim 1, wherein the step S3 of thresholding the weight of the BN layer comprises the steps of:
calculating the mean of the quotient of each neuron weight and the input standard deviation;
and limiting the quotient of each neuron weight and the input standard deviation to be close to the corresponding mean, obtaining the new BN layer weights.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010700940.7A CN111967574B (en) | 2020-07-20 | 2020-07-20 | Tensor singular value delimitation-based convolutional neural network training method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111967574A true CN111967574A (en) | 2020-11-20 |
CN111967574B CN111967574B (en) | 2024-01-23 |
Family
ID=73362075
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010700940.7A Active CN111967574B (en) | 2020-07-20 | 2020-07-20 | Tensor singular value delimitation-based convolutional neural network training method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111967574B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107092960A (en) * | 2017-04-17 | 2017-08-25 | 中国民航大学 | A kind of improved parallel channel convolutional neural networks training method |
CN107680044A (en) * | 2017-09-30 | 2018-02-09 | 福建帝视信息科技有限公司 | A kind of image super-resolution convolutional neural networks speed-up computation method |
CN107967516A (en) * | 2017-10-12 | 2018-04-27 | 中科视拓(北京)科技有限公司 | A kind of acceleration of neutral net based on trace norm constraint and compression method |
CN108537252A (en) * | 2018-03-21 | 2018-09-14 | 温州大学苍南研究院 | A kind of image noise elimination method based on new norm |
CN108649926A (en) * | 2018-05-11 | 2018-10-12 | 电子科技大学 | DAS data de-noising methods based on wavelet basis tensor rarefaction representation |
CN109214441A (en) * | 2018-08-23 | 2019-01-15 | 桂林电子科技大学 | A kind of fine granularity model recognition system and method |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113537492A (en) * | 2021-07-19 | 2021-10-22 | 第六镜科技(成都)有限公司 | Model training and data processing method, device, equipment, medium and product |
CN113537492B (en) * | 2021-07-19 | 2024-04-26 | 第六镜科技(成都)有限公司 | Model training and data processing method, device, equipment, medium and product |
CN114638283A (en) * | 2022-02-11 | 2022-06-17 | 华南理工大学 | Orthogonal convolution neural network image identification method based on tensor optimization space |
CN114638283B (en) * | 2022-02-11 | 2024-08-16 | 华南理工大学 | Tensor optimization space-based orthogonal convolutional neural network image recognition method |
CN117352049A (en) * | 2023-10-31 | 2024-01-05 | 河南大学 | Parameter efficient protein language model design method based on self-supervision learning and Kronecker product decomposition |
CN117352049B (en) * | 2023-10-31 | 2024-09-13 | 河南大学 | Parameter efficient protein language model design method based on self-supervision learning and Kronecker product decomposition |
Also Published As
Publication number | Publication date |
---|---|
CN111967574B (en) | 2024-01-23 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |