CN112668717B - Data processing method and device oriented to neural network model optimization - Google Patents

Data processing method and device oriented to neural network model optimization

Info

Publication number
CN112668717B
CN112668717B (application CN202110002440.0A)
Authority
CN
China
Prior art keywords
expansion
cartesian
order
dimension
data
Prior art date
Legal status
Active
Application number
CN202110002440.0A
Other languages
Chinese (zh)
Other versions
CN112668717A (en)
Inventor
李海峰
徐聪
马琳
丰上
薄洪健
陈婧
王子豪
李洪伟
孙聪珊
徐忠亮
朱泓嘉
张子卿
熊文静
丁施航
姜文浩
Current Assignee
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN202110002440.0A
Publication of CN112668717A
Application granted
Publication of CN112668717B

Landscapes

  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a data processing method and device oriented to neural network model optimization. The method maps the original data into a higher-order Cartesian expansion space with stronger expressive capability and richer information by computing higher-order Cartesian expansion terms. The device comprises an input module, a Cartesian expansion calculation module and an output module. The input module is used for determining and receiving the multidimensional data for calculation, including the dimension of the input data and the value of each dimension; the Cartesian expansion calculation module is used for performing the Cartesian expansion calculation on the multidimensional input data determined by the input module; and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result. The invention has the advantages that, without degrading model performance, it reduces the difficulty of subsequent model learning, improves learning efficiency, and facilitates distributed parallel computing.

Description

Data processing method and device oriented to neural network model optimization
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a data processing method and device oriented to neural network model optimization.
Background
Machine learning is the discipline that studies how computers simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures, thereby continuously improving their own performance. In this field, machine learning methods typified by deep neural networks have achieved great success. The most widely applied neural network models, such as CNNs and RNNs, deliver good recognition results in computer vision, natural language processing, speech recognition and other fields. Behind these successful machine learning applications lies data processing: in neural network training in particular, models place strict requirements on input data, and an efficient, well-designed data processing method is needed to improve model capacity and training efficiency. Moreover, current artificial intelligence techniques that take deep neural networks as their main tool generally adopt deep structures, which bring some unavoidable drawbacks. When the network structure becomes very deep, vanishing or exploding gradients can occur during training; at the same time, training a neural network requires a large amount of data, so training timeliness is a critical problem, and the topology of a deep structure is unfavorable for distributed parallel computing, making timeliness a bottleneck.
Existing machine learning algorithms mostly train models directly on multidimensional data composed of manually designed features. Data preprocessing is largely limited to normalizing the data format and meeting the requirements of the target model, for example normalization, regularization, and shifting or rotation operations used to augment the training set of a neural network.
Such processing only changes the representation of the data; it does not process the information in the data at a finer granularity, so the structure of the model (especially of neural network models) and the training process cannot be optimized at the level of data analysis and processing.
Other methods, such as the cosine transform and Principal Component Analysis (PCA), are mainly used for dimension reduction and feature screening of input data. By reducing the dimension of the input data while retaining the dimensions that play a key role in model performance, model complexity is reduced and data processing efficiency and model accuracy are improved.
However, such dimension-reduction methods only screen data dimensions. They can optimize the model to a certain extent, but they do not improve the expressive capability of the data; optimizing the amount of data alone cannot optimize the model structure, and thus does not fundamentally improve training efficiency.
Disclosure of Invention
Aiming at the above defects of the prior art, the invention provides a data processing method and device oriented to neural network model optimization that overcomes these defects.
In order to achieve the above object, the present invention adopts the following technical scheme:
A data processing method oriented to neural network model optimization comprises the following steps:
1) Receiving input data:
For input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the individual dimensions of the input vector, the data dimension N and the value x_i of the i-th dimension are determined.
2) Full P-th order Cartesian dilation operation:
2.1 Determination of the highest order P
The value of P is determined by the data dimension and the specific computing hardware. When the data dimension is large (greater than 1000), P preferably takes a value of 2 to 3; otherwise, P can be determined from the memory M of the actually deployed computing hardware and the computing precision λ, taking the P value corresponding to the largest expansion dimension the memory can accommodate, i.e.
P = max{ p : λ · (C(N+p, p) − 1) ≤ M },
where C(N+p, p) − 1, with C(·,·) the binomial coefficient, is the number of distinct expansion terms of orders 1 through p.
2.2 Calculation of the Cartesian expansion terms of each order
For a given order k, each k-th order Cartesian expansion term is constructed as the product
s = x_1^{p_1} · x_2^{p_2} · … · x_N^{p_N},
where p_j, the power of the j-th dimension x_j, is a non-negative integer, and the powers satisfy p_1 + p_2 + … + p_N = k.
According to this construction, all possible product values are calculated for each k from 1 to P, i.e. all Cartesian expansion terms from 1st order to P-th order composed of the individual dimensions. For example, for N = 2 and P = 2 these terms are x_1, x_2, x_1^2, x_1·x_2 and x_2^2.
3) Outputting a result:
After the calculation of step 2) is completed, all obtained 1st- to P-th-order Cartesian expansion results are arranged together in a fixed order to form a result vector S = (s_1, s_2, …, s_M), where each s is a calculated Cartesian expansion term and M is the dimension of the result vector. This vector is finally output.
Preferably, the P-th-order Cartesian expansion operation in step 2) may be implemented by matrix multiplication, specifically as follows:
For an input vector X = (x_1, x_2, …, x_N) of dimension N, the 2nd-order Cartesian expansion result consists of the elements of the (N, N)-shaped matrix X^T X. The 3rd-order result is obtained by multiplying each of the N column vectors of the 2nd-order result by X, so that the N resulting matrices form a 3-dimensional matrix of shape (N, N, N). Higher-order terms are calculated analogously: each additional order adds one dimension to the Cartesian expansion result matrix, until the P-th-order Cartesian expansion result is calculated.
Preferably, before the result vector S in step 2) is output, its dimensions are screened by a dimension-reduction method: the dimensions of relatively low importance are removed, and the screened vector is output as the result.
The invention also discloses a data processing device, comprising: an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving the multidimensional data for calculation, including the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the 1st- to P-th-order Cartesian expansion calculation on the multidimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
Further, the Cartesian expansion calculation module includes: an expansion order unit and a multiplication operation unit;
an expansion order unit, configured to set the highest order of the Cartesian expansion according to the specific problem, the computing device and other factors;
and a multiplication operation unit, configured to calculate the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multidimensional data provided by the input module.
Compared with the prior art, the invention has the advantages that:
the input data is subjected to dimension transformation by utilizing a multi-order Cartesian expansion algorithm, and the original input data is mapped into a Cartesian expansion space with higher order, so that the expression capacity of the data and the degree of distinction between different categories are improved, the same data change function as a deep neural network is realized, the neural network is further enabled to be possibly optimized from the deep structure to the breadth structure, distributed parallel calculation is supported more effectively, the training efficiency and effect of a machine learning model are improved under the condition that the training data quantity and the computing capacity are the same, and important engineering application significance and research value exist in the fields of artificial intelligence and pattern recognition analysis.
Drawings
FIG. 1 is a schematic diagram of Cartesian expansion data processing according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data processing apparatus according to an embodiment of the present invention;
FIG. 3 is a flow chart of a 3rd-order Cartesian expansion implemented by matrix multiplication according to an embodiment of the invention.
Detailed Description
The invention will be described in further detail below with reference to the accompanying drawings and by way of examples in order to make the objects, technical solutions and advantages of the invention more apparent.
Example 1
As shown in FIG. 1, a data processing method oriented to neural network model optimization comprises the following steps:
1) Receiving input data:
For input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the individual dimensions of the input vector, the data dimension N and the value x_i of the i-th dimension are determined.
2) Full P-th order Cartesian dilation operation:
2.1 Determination of the highest order P
The value of P is determined by the data dimension and the specific computing hardware. When the data dimension is large (greater than 1000), P preferably takes a value of 2 to 3; otherwise, P can be determined from the memory M of the actually deployed computing hardware and the computing precision λ, taking the P value corresponding to the largest expansion dimension the memory can accommodate, i.e.
P = max{ p : λ · (C(N+p, p) − 1) ≤ M },
where C(N+p, p) − 1, with C(·,·) the binomial coefficient, is the number of distinct expansion terms of orders 1 through p.
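The following Python sketch illustrates this selection rule: it returns the largest order P whose full 1st- to P-th-order expansion still fits in memory, under the term count C(N+p, p) − 1 used in the reconstruction above and taking λ as the number of bytes per stored value. The function name and the example figures are illustrative, not from the patent.

```python
from math import comb

def choose_highest_order(n_dims: int, mem_bytes: int, bytes_per_value: int) -> int:
    """Largest order P whose full 1st- to P-th-order expansion fits in memory.

    Assumes the number of distinct expansion terms up to order p
    (duplicates from commutativity removed) is C(n_dims + p, p) - 1.
    """
    if n_dims > 1000:      # high-dimensional input: the patent prefers P = 2 to 3
        return 3
    p = 1
    # Grow p while the next order's full expansion still fits in memory.
    while bytes_per_value * (comb(n_dims + p + 1, p + 1) - 1) <= mem_bytes:
        p += 1
    return p

# e.g. a 100-dimensional float32 vector and 4 GiB of free memory (illustrative)
print(choose_highest_order(100, 4 * 2**30, 4))   # -> 5
```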
2.2 Calculation of the Cartesian expansion terms of each order
For a given order k, each k-th order Cartesian expansion term is constructed as the product
s = x_1^{p_1} · x_2^{p_2} · … · x_N^{p_N},
where p_j, the power of the j-th dimension x_j, is a non-negative integer, and the powers satisfy p_1 + p_2 + … + p_N = k.
According to this construction, all possible product values are calculated for each k from 1 to P, i.e. all Cartesian expansion terms from 1st order to P-th order composed of the individual dimensions.
It should be appreciated that a loop-based enumeration is used to calculate all possible Cartesian expansion terms, so that repeated terms arising from the commutativity of multiplication are avoided when enumerating the final set of Cartesian expansion terms.
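A minimal Python sketch of such an enumeration (an illustration, not the patent's code): itertools.combinations_with_replacement yields each multiset of factor indices exactly once, which is precisely how the repeats due to commutativity are avoided.

```python
from itertools import combinations_with_replacement
from math import prod
from typing import Sequence

def cartesian_expansion(x: Sequence[float], highest_order: int) -> list:
    """All 1st- to P-th-order Cartesian expansion terms of x.

    combinations_with_replacement generates each multiset of factor
    indices exactly once, so x1*x2 and x2*x1 are not both produced.
    """
    terms = []
    for k in range(1, highest_order + 1):
        for idx in combinations_with_replacement(range(len(x)), k):
            terms.append(prod(x[i] for i in idx))
    return terms

print(cartesian_expansion([2.0, 3.0], 2))   # [2.0, 3.0, 4.0, 6.0, 9.0]
```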
3) Outputting a result:
After the calculation in step 2) is completed, all obtained 1st- to P-th-order Cartesian expansion results are arranged together in a fixed order to form a result vector S = (s_1, s_2, …, s_M), where each s is a calculated Cartesian expansion term and M is the dimension of the result vector. This vector is finally output.
In the output step, a dimension-screening operation may be performed on the result vector S before it is output. Depending on the value of P, the calculation method of the P-th-order Cartesian expansion, and the specific problem, not every dimension of the result vector S plays an important role. Therefore, before output, its dimensions may be screened by a dimension-reduction method such as the cosine transform or Principal Component Analysis (PCA): the dimensions of relatively low importance are removed, and the screened vector is output as the result.
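As an illustration of this optional screening step, the sketch below applies scikit-learn's PCA, one of the dimension-reduction methods named above, to a batch of order-2 expansion vectors. The random data and the 99% variance threshold are assumptions chosen for demonstration, not values from the patent.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
raw = rng.random((500, 10))                    # 500 samples, N = 10 (illustrative)

# 2nd-order terms via the outer product of each sample with itself,
# flattened; symmetric duplicates are tolerated here for brevity.
order2 = np.einsum('bi,bj->bij', raw, raw).reshape(len(raw), -1)
S_batch = np.hstack([raw, order2])             # 1st- plus 2nd-order expansion

# Screening: keep the principal components explaining 99% of the variance
# (an assumed threshold) and drop the relatively unimportant remainder.
pca = PCA(n_components=0.99)
S_screened = pca.fit_transform(S_batch)
print(S_batch.shape, '->', S_screened.shape)
```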
As shown in FIG. 2, the data processing device comprises: an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving the multidimensional data for calculation, including the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the 1st- to P-th-order Cartesian expansion calculation on the multidimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
The Cartesian expansion calculation module includes: an expansion order unit and a multiplication operation unit;
an expansion order unit, configured to set the highest order of the Cartesian expansion according to the specific problem, the computing device and other factors;
and a multiplication operation unit, configured to calculate the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multidimensional data provided by the input module.
In the processing flow, the original data first enters the input module, and is then sent to the Cartesian expansion calculation module, where the 1st- to P-th-order Cartesian expansion results are calculated. The expansion results of each order are then sent to the output module to complete the high-dimensional fusion, finally forming the output data.
Example 2
This example only illustrates the differences from Example 1.
the operation for the P-th order cartesian expansion may be achieved by multiplication of a matrix (tensor), as shown in fig. 3, specifically as follows:
for an input vector x= (X) of dimension n 1 ,x 2 ,…,x N ) The result of the 2-order Cartesian expansion is a matrix X in the form of (N, N) T The element in X, the 3-order cartesian expansion result is a matrix formed by multiplying each of N column vectors in the 2-order result by X, so that the N matrices form a 3-dimensional matrix in the form of (N, N). Calculation of higher order terms and so on, the Cartesian expansion result matrix is increased by one dimension every time the first order is increased until the result of the P-order Cartesian expansion is calculated.
The cross terms obtained in this way contain many repetitions, but the approach effectively avoids the low computational efficiency caused by a large number of loops.
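A NumPy sketch of this scheme (an illustration of the description above, not the patent's code): each additional order is one outer product of the previous result tensor with the input vector, so the computation stays vectorized.

```python
import numpy as np

def expansion_tensors(x: np.ndarray, highest_order: int) -> list:
    """Order-1 to order-P Cartesian expansions as tensors.

    Each additional order takes one outer product of the previous
    result with x, so the order-k result has k axes of length N.
    Cross terms such as x1*x2 and x2*x1 both appear, but no Python
    loop over index tuples is needed.
    """
    results = [x]                                   # order 1: the vector itself
    for _ in range(2, highest_order + 1):
        results.append(np.multiply.outer(results[-1], x))
    return results

x = np.array([2.0, 3.0])
orders = expansion_tensors(x, 3)
print(orders[1])          # 2nd order, shape (2, 2): [[4. 6.], [6. 9.]]
print(orders[2].shape)    # 3rd order: (2, 2, 2)
```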
Those of ordinary skill in the art will appreciate that the embodiments described herein are intended to aid the reader in understanding the practice of the invention and that the scope of the invention is not limited to such specific statements and embodiments. Those of ordinary skill in the art can make various other specific modifications and combinations from the teachings of the present disclosure without departing from the spirit thereof, and such modifications and combinations remain within the scope of the present disclosure.

Claims (4)

1. A data processing method oriented to neural network model optimization, characterized by comprising the following steps:
receiving input data:
for input vector data X = (x_1, x_2, …, x_N), wherein x_1, x_2, …, x_N are the individual dimensions of the input vector, determining the data dimension N and the value x_i of the i-th dimension;
determining the highest order P:
determining the value of P according to the data dimension and the specific computing hardware; when the data dimension is greater than 1000, the value of P ranges from 2 to 3; otherwise
P = max{ p : λ · (C(N+p, p) − 1) ≤ M },
wherein M is the memory of the computing hardware, λ is the computing precision, and C(·,·) denotes the binomial coefficient;
computing Cartesian expansion terms of each order
constructing each k-th order Cartesian expansion term as the product
s = x_1^{p_1} · x_2^{p_2} · … · x_N^{p_N},
wherein p_j, the power of the j-th dimension x_j, is a non-negative integer and the powers satisfy p_1 + p_2 + … + p_N = k;
According to the Cartesian expansion terms, respectively calculating all Cartesian expansion terms from 1st order to P-th order;
outputting a result:
after the calculation in the step 2) is completed, all obtained 1-P-order Cartesian expansion results are sequentially arranged together to form a result vector S= (S) 1 ,s 2 ,…,s M ) Outputting the output; where s represents each calculated Cartesian expansion term and M represents the dimension of the result vector.
2. The data processing method according to claim 1, characterized in that: the P-th-order Cartesian expansion operation in step 2) is implemented by matrix multiplication, specifically as follows:
for an input vector X = (x_1, x_2, …, x_N) of dimension N, the 2nd-order Cartesian expansion result consists of the elements of the (N, N)-shaped matrix X^T X; the 3rd-order result multiplies each of the N column vectors of the 2nd-order result by X, so that the N resulting matrices form a 3-dimensional matrix of shape (N, N, N); higher-order terms are calculated analogously, the Cartesian expansion result matrix gaining one dimension with each additional order, until the P-th-order Cartesian expansion result is calculated.
3. The data processing method according to claim 1, characterized in that: before the result vector S in step 2) is output, its dimensions are screened by a dimension-reduction method, the dimensions of relatively low importance are removed, and the screened vector is output as the result.
4. A data processing device oriented to neural network model optimization, characterized in that: the data processing device is configured to run the data processing method oriented to neural network model optimization according to any one of claims 1 to 3;
the data processing device includes: the device comprises an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving the multidimensional data for calculation, including the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the 1st- to P-th-order Cartesian expansion calculation on the multidimensional input data determined by the input module;
the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module;
the Cartesian expansion calculation module includes: an expansion order unit and a multiplication operation unit;
an expansion order unit for setting the highest order of the Cartesian expansion;
and a multiplication operation unit, configured to calculate the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multidimensional data provided by the input module.
CN202110002440.0A 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization Active CN112668717B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110002440.0A CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110002440.0A CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Publications (2)

Publication Number Publication Date
CN112668717A CN112668717A (en) 2021-04-16
CN112668717B 2023-06-02

Family

ID=75412620

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110002440.0A Active CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Country Status (1)

Country Link
CN (1) CN112668717B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250810A (en) * 2015-06-15 2016-12-21 摩福公司 By iris identification, individuality is identified and/or the method for certification
CN106887000A (en) * 2017-01-23 2017-06-23 上海联影医疗科技有限公司 The gridding processing method and its system of medical image
WO2018224690A1 (en) * 2017-06-09 2018-12-13 Deepmind Technologies Limited Generating discrete latent representations of input data items
CN107729994A (en) * 2017-11-28 2018-02-23 北京地平线信息技术有限公司 The method and apparatus for performing the computing of the convolutional layer in convolutional neural networks
CN107832842A (en) * 2017-11-28 2018-03-23 北京地平线信息技术有限公司 The method and apparatus that convolution algorithm is performed for fold characteristics data
CN107944556A (en) * 2017-12-12 2018-04-20 电子科技大学 Deep neural network compression method based on block item tensor resolution

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MODENN: A Shallow Broad Neural Network Model Based on Multi-Order Descartes Expansion; Haifeng Li et al.; IEEE Transactions on Pattern Analysis and Machine Intelligence; vol. 44, no. 12; 9417-9433 *
Predicting musically induced emotions from physiological inputs: linear and neural network models; Frank A. Russo et al.; Frontiers in Psychology; vol. 4; 1-8 *
Research on application performance optimization based on computation offloading in mobile cloud; 疏官胜; China Doctoral Dissertations Full-text Database, Information Science and Technology; no. 05, 2019; I139-9 *

Also Published As

Publication number Publication date
CN112668717A (en) 2021-04-16

Similar Documents

Publication Publication Date Title
Zhou et al. Rethinking bottleneck structure for efficient mobile network design
EP4036803A1 (en) Neural network model processing method and apparatus, computer device, and storage medium
CN109657780A (en) A kind of model compression method based on beta pruning sequence Active Learning
Huang et al. Sndcnn: Self-normalizing deep cnns with scaled exponential linear units for speech recognition
CN112699247A (en) Knowledge representation learning framework based on multi-class cross entropy contrast completion coding
US20220335304A1 (en) System and Method for Automated Design Space Determination for Deep Neural Networks
CN116306793A (en) Self-supervision learning method with target task directivity based on comparison twin network
CN114925320B (en) Data processing method and related device
Qi et al. Learning low resource consumption cnn through pruning and quantization
CN112668717B (en) Data processing method and device oriented to neural network model optimization
CN116128019A (en) Parallel training method and device for transducer model
Xia et al. Efficient synthesis of compact deep neural networks
CN112214668B (en) Personalized financial service recommendation device and method based on big data
CN115062769A (en) Knowledge distillation-based model training method, device, equipment and storage medium
CN114723024A (en) Linear programming-based neural network mapping method for storage and calculation integrated chip
CN111737462A (en) Mass data entity similarity pair determination method and system
Liawatimena et al. Performance optimization of maxpool calculation using 4d rank tensor
Li et al. CUSNTF: A scalable sparse non-negative tensor factorization model for large-scale industrial applications on multi-GPU
CN113849592B (en) Text emotion classification method and device, electronic equipment and storage medium
TWI768497B (en) Intelligent processor, data processing method and storage medium
CN111814462B (en) Efficient lifelong relationship extraction method and system based on dynamic regularization
CN113449817B (en) Image classification implicit model acceleration training method based on phantom gradient
CN107341485A (en) Face identification method and device
CN117407793B (en) Parallelization strategy optimization method, system, equipment and medium for large language model
CN116957007A (en) Feature quantization method, device, medium and program product for neural network training

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant