CN112668717B - Data processing method and device oriented to neural network model optimization - Google Patents

Data processing method and device oriented to neural network model optimization

Info

Publication number
CN112668717B
CN112668717B (application CN202110002440.0A)
Authority
CN
China
Prior art keywords
expansion
cartesian
order
dimension
data
Prior art date
Legal status
Active
Application number
CN202110002440.0A
Other languages
Chinese (zh)
Other versions
CN112668717A (en)
Inventor
李海峰
徐聪
马琳
丰上
薄洪健
陈婧
王子豪
李洪伟
孙聪珊
徐忠亮
朱泓嘉
张子卿
熊文静
丁施航
姜文浩
Current Assignee
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN202110002440.0A
Publication of CN112668717A
Application granted
Publication of CN112668717B

Landscapes

  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a data processing method and device oriented to neural network model optimization. The method maps the original data into a higher-order Cartesian expansion space with stronger expressive capability and richer information by computing higher-order Cartesian expansion terms. The device comprises an input module, a Cartesian expansion calculation module and an output module. The input module is used for determining and receiving the multidimensional data for calculation, including the dimension of the input data and the value of each dimension; the Cartesian expansion calculation module is used for performing the Cartesian expansion calculation on the multidimensional input data determined by the input module; and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result. The invention has the advantages that, without degrading model performance, it reduces the difficulty of subsequent model learning, improves learning efficiency, and facilitates distributed parallel computing.

Description

Data processing method and device oriented to neural network model optimization
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a data processing method and device oriented to neural network model optimization.
Background
Machine learning is the discipline that studies how computers simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures, thereby continuously improving their own performance. In this field, machine learning methods typified by deep neural networks have achieved great success. The most widely applied neural network models, such as CNNs and RNNs, deliver good recognition results in computer vision, natural language processing, speech recognition and other fields. Behind these successful machine learning applications lies data processing: in neural network training in particular, models place strict requirements on input data, and an efficient, well-designed data processing method is needed to improve model capacity and training efficiency. Moreover, current artificial intelligence techniques that take deep neural networks as their main tool generally adopt deep structures, which bring some unavoidable drawbacks. When the network structure becomes very deep, vanishing or exploding gradients can occur during training; at the same time, training a neural network requires a large amount of data, so training timeliness is a critical problem, and the topology of a deep structure is unfavorable for distributed parallel computing, making timeliness a bottleneck.
Existing machine learning algorithms mostly train models directly on multidimensional data composed of manually designed features. Data preprocessing is largely limited to normalizing the data format and meeting the requirements of the target model, for example normalization, regularization, and shifting or rotation operations used to augment the training set of a neural network.
Such processing only changes the representation of the data; it does not process the information in the data at a finer granularity, so the structure of the model (especially of neural network models) and the training process cannot be optimized at the level of data analysis and processing.
Other methods, such as the cosine transform and Principal Component Analysis (PCA), are mainly used for dimension reduction and feature screening of input data. By reducing the dimension of the input data while retaining the dimensions that play a key role in model performance, model complexity is reduced and data processing efficiency and model accuracy are improved.
However, such dimension-reduction methods only screen data dimensions. They can optimize the model to a certain extent, but they do not improve the expressive capability of the data; optimizing the amount of data alone cannot optimize the model structure, and thus does not fundamentally improve training efficiency.
Disclosure of Invention
Aiming at the above defects of the prior art, the invention provides a data processing method and device oriented to neural network model optimization that overcomes these defects.
In order to achieve the above object, the present invention adopts the following technical scheme:
A data processing method oriented to neural network model optimization comprises the following steps:
1) Receiving input data:
For input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the individual dimensions of the input vector, the data dimension N and the value x_i of the i-th dimension are determined.
2) Full P-th order Cartesian dilation operation:
2.1 Determination of the highest order P
The value of P is determined by the data dimension and the specific computing hardware. When the data dimension is large (greater than 1000), P preferably takes a value of 2 to 3; otherwise, P can be determined from the memory M of the actually deployed computing hardware and the computing precision λ, taking the P value corresponding to the largest expansion dimension the memory can accommodate, i.e.
P = max{ p : λ · (C(N+p, p) − 1) ≤ M },
where C(N+p, p) − 1, with C(·,·) the binomial coefficient, is the number of distinct expansion terms of orders 1 through p.
2.2 Calculation of the Cartesian expansion terms of each order
For a given order k, each k-th order Cartesian expansion term is constructed as the product
s = x_1^{p_1} · x_2^{p_2} · … · x_N^{p_N},
where p_j, the power of the j-th dimension x_j, is a non-negative integer, and the powers satisfy p_1 + p_2 + … + p_N = k.
According to this construction, all possible product values are calculated for each k from 1 to P, i.e. all Cartesian expansion terms from 1st order to P-th order composed of the individual dimensions. For example, for N = 2 and P = 2 these terms are x_1, x_2, x_1^2, x_1·x_2 and x_2^2.
3) Outputting a result:
After the calculation of step 2) is completed, all obtained 1st- to P-th-order Cartesian expansion results are arranged together in a fixed order to form a result vector S = (s_1, s_2, …, s_M), where each s is a calculated Cartesian expansion term and M is the dimension of the result vector. This vector is finally output.
Preferably, the P-th-order Cartesian expansion operation in step 2) may be implemented by matrix multiplication, specifically as follows:
For an input vector X = (x_1, x_2, …, x_N) of dimension N, the 2nd-order Cartesian expansion result consists of the elements of the (N, N)-shaped matrix X^T X. The 3rd-order result is obtained by multiplying each of the N column vectors of the 2nd-order result by X, so that the N resulting matrices form a 3-dimensional matrix of shape (N, N, N). Higher-order terms are calculated analogously: each additional order adds one dimension to the Cartesian expansion result matrix, until the P-th-order Cartesian expansion result is calculated.
Preferably, before the result vector S in step 2) is output, its dimensions are screened by a dimension-reduction method: the dimensions of relatively low importance are removed, and the screened vector is output as the result.
The invention also discloses a data processing device, comprising: an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving the multidimensional data for calculation, including the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the 1st- to P-th-order Cartesian expansion calculation on the multidimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
Further, the Cartesian expansion calculation module includes: an expansion order unit and a multiplication operation unit;
an expansion order unit, configured to set the highest order of the Cartesian expansion according to the specific problem, the computing device and other factors;
and a multiplication operation unit, configured to calculate the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multidimensional data provided by the input module.
Compared with the prior art, the invention has the advantages that:
the input data is subjected to dimension transformation by utilizing a multi-order Cartesian expansion algorithm, and the original input data is mapped into a Cartesian expansion space with higher order, so that the expression capacity of the data and the degree of distinction between different categories are improved, the same data change function as a deep neural network is realized, the neural network is further enabled to be possibly optimized from the deep structure to the breadth structure, distributed parallel calculation is supported more effectively, the training efficiency and effect of a machine learning model are improved under the condition that the training data quantity and the computing capacity are the same, and important engineering application significance and research value exist in the fields of artificial intelligence and pattern recognition analysis.
Drawings
FIG. 1 is a schematic diagram of Cartesian expansion data processing according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data processing apparatus according to an embodiment of the present invention;
FIG. 3 is a flow chart of a 3rd-order Cartesian expansion implemented by matrix multiplication according to an embodiment of the invention.
Detailed Description
The invention will be described in further detail below with reference to the accompanying drawings and by way of examples in order to make the objects, technical solutions and advantages of the invention more apparent.
Example 1
As shown in FIG. 1, a data processing method oriented to neural network model optimization comprises the following steps:
1) Receiving input data:
For input vector data X = (x_1, x_2, …, x_N), where x_1, x_2, …, x_N are the individual dimensions of the input vector, the data dimension N and the value x_i of the i-th dimension are determined.
2) Full P-th order Cartesian dilation operation:
2.1 Determination of the highest order P
The value of P is determined by the data dimension and the specific computing hardware. When the data dimension is large (greater than 1000), P preferably takes a value of 2 to 3; otherwise, P can be determined from the memory M of the actually deployed computing hardware and the computing precision λ, taking the P value corresponding to the largest expansion dimension the memory can accommodate, i.e.
P = max{ p : λ · (C(N+p, p) − 1) ≤ M },
where C(N+p, p) − 1, with C(·,·) the binomial coefficient, is the number of distinct expansion terms of orders 1 through p.
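The following Python sketch illustrates this selection rule: it returns the largest order P whose full 1st- to P-th-order expansion still fits in memory, under the term count C(N+p, p) − 1 used in the reconstruction above and taking λ as the number of bytes per stored value. The function name and the example figures are illustrative, not from the patent.

```python
from math import comb

def choose_highest_order(n_dims: int, mem_bytes: int, bytes_per_value: int) -> int:
    """Largest order P whose full 1st- to P-th-order expansion fits in memory.

    Assumes the number of distinct expansion terms up to order p
    (duplicates from commutativity removed) is C(n_dims + p, p) - 1.
    """
    if n_dims > 1000:      # high-dimensional input: the patent prefers P = 2 to 3
        return 3
    p = 1
    # Grow p while the next order's full expansion still fits in memory.
    while bytes_per_value * (comb(n_dims + p + 1, p + 1) - 1) <= mem_bytes:
        p += 1
    return p

# e.g. a 100-dimensional float32 vector and 4 GiB of free memory (illustrative)
print(choose_highest_order(100, 4 * 2**30, 4))   # -> 5
```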
2.2 Calculation of the Cartesian expansion terms of each order
For a given order k, each k-th order Cartesian expansion term is constructed as the product
s = x_1^{p_1} · x_2^{p_2} · … · x_N^{p_N},
where p_j, the power of the j-th dimension x_j, is a non-negative integer, and the powers satisfy p_1 + p_2 + … + p_N = k.
According to this construction, all possible product values are calculated for each k from 1 to P, i.e. all Cartesian expansion terms from 1st order to P-th order composed of the individual dimensions.
It should be appreciated that a loop-based enumeration is used to calculate all possible Cartesian expansion terms, so that repeated terms arising from the commutativity of multiplication are avoided when enumerating the final set of Cartesian expansion terms.
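A minimal Python sketch of such an enumeration (an illustration, not the patent's code): itertools.combinations_with_replacement yields each multiset of factor indices exactly once, which is precisely how the repeats due to commutativity are avoided.

```python
from itertools import combinations_with_replacement
from math import prod
from typing import Sequence

def cartesian_expansion(x: Sequence[float], highest_order: int) -> list:
    """All 1st- to P-th-order Cartesian expansion terms of x.

    combinations_with_replacement generates each multiset of factor
    indices exactly once, so x1*x2 and x2*x1 are not both produced.
    """
    terms = []
    for k in range(1, highest_order + 1):
        for idx in combinations_with_replacement(range(len(x)), k):
            terms.append(prod(x[i] for i in idx))
    return terms

print(cartesian_expansion([2.0, 3.0], 2))   # [2.0, 3.0, 4.0, 6.0, 9.0]
```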
3) Outputting a result:
After the calculation in step 2) is completed, all obtained 1st- to P-th-order Cartesian expansion results are arranged together in a fixed order to form a result vector S = (s_1, s_2, …, s_M), where each s is a calculated Cartesian expansion term and M is the dimension of the result vector. This vector is finally output.
In the output step, a dimension-screening operation may be performed on the result vector S before it is output. Depending on the value of P, the calculation method of the P-th-order Cartesian expansion, and the specific problem, not every dimension of the result vector S plays an important role. Therefore, before output, its dimensions may be screened by a dimension-reduction method such as the cosine transform or Principal Component Analysis (PCA): the dimensions of relatively low importance are removed, and the screened vector is output as the result.
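As an illustration of this optional screening step, the sketch below applies scikit-learn's PCA, one of the dimension-reduction methods named above, to a batch of order-2 expansion vectors. The random data and the 99% variance threshold are assumptions chosen for demonstration, not values from the patent.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
raw = rng.random((500, 10))                    # 500 samples, N = 10 (illustrative)

# 2nd-order terms via the outer product of each sample with itself,
# flattened; symmetric duplicates are tolerated here for brevity.
order2 = np.einsum('bi,bj->bij', raw, raw).reshape(len(raw), -1)
S_batch = np.hstack([raw, order2])             # 1st- plus 2nd-order expansion

# Screening: keep the principal components explaining 99% of the variance
# (an assumed threshold) and drop the relatively unimportant remainder.
pca = PCA(n_components=0.99)
S_screened = pca.fit_transform(S_batch)
print(S_batch.shape, '->', S_screened.shape)
```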
As shown in FIG. 2, the data processing device comprises: an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving the multidimensional data for calculation, including the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the 1st- to P-th-order Cartesian expansion calculation on the multidimensional input data determined by the input module;
and the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module.
The Cartesian expansion calculation module includes: an expansion order unit and a multiplication operation unit;
an expansion order unit, configured to set the highest order of the Cartesian expansion according to the specific problem, the computing device and other factors;
and a multiplication operation unit, configured to calculate the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multidimensional data provided by the input module.
In the processing flow, the original data first enters the input module, and is then sent to the Cartesian expansion calculation module, where the 1st- to P-th-order Cartesian expansion results are calculated. The expansion results of each order are then sent to the output module to complete the high-dimensional fusion, finally forming the output data.
Example 2
This example only illustrates the differences from Example 1.
the operation for the P-th order cartesian expansion may be achieved by multiplication of a matrix (tensor), as shown in fig. 3, specifically as follows:
for an input vector x= (X) of dimension n 1 ,x 2 ,…,x N ) The result of the 2-order Cartesian expansion is a matrix X in the form of (N, N) T The element in X, the 3-order cartesian expansion result is a matrix formed by multiplying each of N column vectors in the 2-order result by X, so that the N matrices form a 3-dimensional matrix in the form of (N, N). Calculation of higher order terms and so on, the Cartesian expansion result matrix is increased by one dimension every time the first order is increased until the result of the P-order Cartesian expansion is calculated.
The cross terms obtained in this way contain many repetitions, but the approach effectively avoids the low computational efficiency caused by a large number of loops.
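A NumPy sketch of this scheme (an illustration of the description above, not the patent's code): each additional order is one outer product of the previous result tensor with the input vector, so the computation stays vectorized.

```python
import numpy as np

def expansion_tensors(x: np.ndarray, highest_order: int) -> list:
    """Order-1 to order-P Cartesian expansions as tensors.

    Each additional order takes one outer product of the previous
    result with x, so the order-k result has k axes of length N.
    Cross terms such as x1*x2 and x2*x1 both appear, but no Python
    loop over index tuples is needed.
    """
    results = [x]                                   # order 1: the vector itself
    for _ in range(2, highest_order + 1):
        results.append(np.multiply.outer(results[-1], x))
    return results

x = np.array([2.0, 3.0])
orders = expansion_tensors(x, 3)
print(orders[1])          # 2nd order, shape (2, 2): [[4. 6.], [6. 9.]]
print(orders[2].shape)    # 3rd order: (2, 2, 2)
```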
Those of ordinary skill in the art will appreciate that the embodiments described herein are intended to aid the reader in understanding the practice of the invention and that the scope of the invention is not limited to such specific statements and embodiments. Those of ordinary skill in the art can make various other specific modifications and combinations from the teachings of the present disclosure without departing from the spirit thereof, and such modifications and combinations remain within the scope of the present disclosure.

Claims (4)

1. A data processing method oriented to neural network model optimization, characterized by comprising the following steps:
receiving input data:
for input vector data X = (x_1, x_2, …, x_N), wherein x_1, x_2, …, x_N are the individual dimensions of the input vector, determining the data dimension N and the value x_i of the i-th dimension;
determining the highest order P:
determining the value of P according to the data dimension and the specific computing hardware; when the data dimension is greater than 1000, the value of P ranges from 2 to 3; otherwise
P = max{ p : λ · (C(N+p, p) − 1) ≤ M },
wherein M is the memory of the computing hardware, λ is the computing precision, and C(·,·) denotes the binomial coefficient;
computing Cartesian expansion terms of each order
constructing each k-th order Cartesian expansion term as the product
s = x_1^{p_1} · x_2^{p_2} · … · x_N^{p_N},
wherein p_j, the power of the j-th dimension x_j, is a non-negative integer and the powers satisfy p_1 + p_2 + … + p_N = k;
According to the Cartesian expansion terms, respectively calculating all Cartesian expansion terms from 1st order to P-th order;
outputting a result:
after the calculation in the step 2) is completed, all obtained 1-P-order Cartesian expansion results are sequentially arranged together to form a result vector S= (S) 1 ,s 2 ,…,s M ) Outputting the output; where s represents each calculated Cartesian expansion term and M represents the dimension of the result vector.
2. The data processing method according to claim 1, characterized in that: the P-th-order Cartesian expansion operation in step 2) is implemented by matrix multiplication, specifically as follows:
for an input vector X = (x_1, x_2, …, x_N) of dimension N, the 2nd-order Cartesian expansion result consists of the elements of the (N, N)-shaped matrix X^T X; the 3rd-order result multiplies each of the N column vectors of the 2nd-order result by X, so that the N resulting matrices form a 3-dimensional matrix of shape (N, N, N); higher-order terms are calculated analogously, the Cartesian expansion result matrix gaining one dimension with each additional order, until the P-th-order Cartesian expansion result is calculated.
3. The data processing method according to claim 1, characterized in that: before the result vector S in step 2) is output, its dimensions are screened by a dimension-reduction method, the dimensions of relatively low importance are removed, and the screened vector is output as the result.
4. A data processing device oriented to neural network model optimization, characterized in that: the data processing device is configured to run the data processing method oriented to neural network model optimization according to any one of claims 1 to 3;
the data processing device includes: the device comprises an input module, a Cartesian expansion calculation module and an output module;
the input module is used for determining and receiving the multidimensional data for calculation, including the dimension of the input data and the value of each dimension;
the Cartesian expansion calculation module is used for performing the 1st- to P-th-order Cartesian expansion calculation on the multidimensional input data determined by the input module;
the output module is used for outputting high-dimensional data for subsequent processing according to the calculation result of the Cartesian expansion calculation module;
the Cartesian expansion calculation module includes: an expansion order unit and a multiplication operation unit;
an expansion order unit for setting the highest order of the Cartesian expansion;
and a multiplication operation unit, configured to calculate the 1st- to P-th-order Cartesian expansion terms among the dimensions of the multidimensional data provided by the input module.
CN202110002440.0A 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization Active CN112668717B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110002440.0A CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110002440.0A CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Publications (2)

Publication Number Publication Date
CN112668717A CN112668717A (en) 2021-04-16
CN112668717B 2023-06-02

Family

ID=75412620

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110002440.0A Active CN112668717B (en) 2021-01-04 2021-01-04 Data processing method and device oriented to neural network model optimization

Country Status (1)

Country Link
CN (1) CN112668717B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250810A (en) * 2015-06-15 2016-12-21 摩福公司 By iris identification, individuality is identified and/or the method for certification
CN106887000A (en) * 2017-01-23 2017-06-23 上海联影医疗科技有限公司 The gridding processing method and its system of medical image
WO2018224690A1 (en) * 2017-06-09 2018-12-13 Deepmind Technologies Limited Generating discrete latent representations of input data items
CN107729994A (en) * 2017-11-28 2018-02-23 北京地平线信息技术有限公司 The method and apparatus for performing the computing of the convolutional layer in convolutional neural networks
CN107832842A (en) * 2017-11-28 2018-03-23 北京地平线信息技术有限公司 The method and apparatus that convolution algorithm is performed for fold characteristics data
CN107944556A (en) * 2017-12-12 2018-04-20 电子科技大学 Deep neural network compression method based on block item tensor resolution

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MODENN: A Shallow Broad Neural Network Model Based on Multi-Order Descartes Expansion; Haifeng Li et al.; IEEE Transactions on Pattern Analysis and Machine Intelligence; vol. 44, no. 12; 9417-9433 *
Predicting musically induced emotions from physiological inputs: linear and neural network models; Frank A. Russo et al.; Frontiers in Psychology; vol. 4; 1-8 *
Research on application performance optimization based on computation offloading in mobile cloud; 疏官胜; China Doctoral Dissertations Full-text Database, Information Science and Technology; no. 05, 2019; I139-9 *

Also Published As

Publication number Publication date
CN112668717A (en) 2021-04-16

Similar Documents

Publication Publication Date Title
Zhou et al. Rethinking bottleneck structure for efficient mobile network design
EP4036803A1 (en) Neural network model processing method and apparatus, computer device, and storage medium
CN109657780A (en) A kind of model compression method based on beta pruning sequence Active Learning
Huang et al. Sndcnn: Self-normalizing deep cnns with scaled exponential linear units for speech recognition
CN112699247A (en) Knowledge representation learning framework based on multi-class cross entropy contrast completion coding
US20220335304A1 (en) System and Method for Automated Design Space Determination for Deep Neural Networks
CN116306793A (en) Self-supervision learning method with target task directivity based on comparison twin network
CN114925320B (en) Data processing method and related device
Qi et al. Learning low resource consumption cnn through pruning and quantization
CN112668717B (en) Data processing method and device oriented to neural network model optimization
CN116128019A (en) Parallel training method and device for transducer model
Xia et al. Efficient synthesis of compact deep neural networks
CN112214668B (en) Personalized financial service recommendation device and method based on big data
CN115062769A (en) Knowledge distillation-based model training method, device, equipment and storage medium
CN114723024A (en) Linear programming-based neural network mapping method for storage and calculation integrated chip
CN111737462A (en) Mass data entity similarity pair determination method and system
Liawatimena et al. Performance optimization of maxpool calculation using 4d rank tensor
Li et al. CUSNTF: A scalable sparse non-negative tensor factorization model for large-scale industrial applications on multi-GPU
CN113849592B (en) Text emotion classification method and device, electronic equipment and storage medium
TWI768497B (en) Intelligent processor, data processing method and storage medium
CN111814462B (en) Efficient lifelong relationship extraction method and system based on dynamic regularization
CN113449817B (en) Image classification implicit model acceleration training method based on phantom gradient
CN107341485A (en) Face identification method and device
CN117407793B (en) Parallelization strategy optimization method, system, equipment and medium for large language model
CN116957007A (en) Feature quantization method, device, medium and program product for neural network training

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant