CN116992196A - Data processing method, system, equipment and medium based on cyclic dynamic expansion - Google Patents

Data processing method, system, equipment and medium based on cyclic dynamic expansion Download PDF

Info

Publication number
CN116992196A
CN116992196A (application CN202311250053.4A)
Authority
CN
China
Prior art keywords
target neural
neural network
differential equation
dynamic expansion
data processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311250053.4A
Other languages
Chinese (zh)
Other versions
CN116992196B (en
Inventor
鲁蔚征
张峰
陈跃国
杜小勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Renmin University of China
Original Assignee
Renmin University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Renmin University of China filed Critical Renmin University of China
Priority to CN202311250053.4A priority Critical patent/CN116992196B/en
Publication of CN116992196A publication Critical patent/CN116992196A/en
Application granted granted Critical
Publication of CN116992196B publication Critical patent/CN116992196B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/11 Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G06F 17/13 Differential equations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to a data processing method, system, device and medium based on cyclic dynamic expansion, comprising the following steps: initializing the parameter values of a pre-established target neural network model, and constructing a target neural differential equation with the target neural network model as the computation of a single time step of the loop body; solving the constructed target neural differential equation with a cyclic dynamic expansion algorithm to obtain a trained target neural network; and taking the current data to be processed as the input of the target neural network and the output of the target neural network as the data processing result. By expanding the loop of the target neural differential equation in a dynamic expansion mode, the application combines the running-speed advantage of full loop expansion with the compiling-speed advantage of no expansion, so the calculation speed can be effectively improved. The application can therefore be widely applied in the field of data processing.

Description

Data processing method, system, equipment and medium based on cyclic dynamic expansion
Technical Field
The application relates to a data processing method, system, device and medium based on cyclic dynamic expansion, and belongs to the fields of artificial intelligence and data processing.
Background
Scientific and engineering disciplines often model domain problems with differential equations and solve them with numerical methods; differential equations and the corresponding solvers are widely used in physics, chemistry, demographics and other fields, covering almost all of science and engineering. AI-driven scientific research (AI for Science) mainly uses neural network algorithms to solve problems in the traditional science and engineering fields. One important branch is to solve differential equations with artificial intelligence techniques, combining the very strong fitting ability of neural networks with differential equations that describe physical laws. For example, a physical problem such as the two-body problem, i.e. the gravitational motion of two particles in a plane, can be solved with a neural ordinary differential equation (Neural Ordinary Differential Equation, Neural ODE). Since gravitational motion is a typical dynamics problem, it can be modeled by a differential equation and, after observation data are collected, solved further with a neural ordinary differential equation.
The solution method based on the neural ordinary differential equation is similar to the traditional one: time is discretized, and the value of each time step is then calculated step by step in a loop. Solving a neural ordinary differential equation is, in effect, training a neural network that contains a loop. Neural network training has to run on heterogeneous chips (e.g. a Graphics Processing Unit, GPU), and neural network compilers commonly handle loops in one of two ways, full expansion and non-expansion, introduced as follows:
the loop is fully unfolded, operators of all calculation steps in the loop are compiled during compiling, and if the single-step calculation is complex and the number of the loop steps is large, the compiling time is long; the unfolded kernel program is triggered only once, the time for data handling and kernel program starting is greatly shortened, and the running time is shortened.
Non-expansion: the compile time is short, but every loop step checks whether the termination condition has been reached and launches a kernel, so if the number of loop steps is large, the number of data transfers and kernel launches grows sharply and the run time increases. Compile time and run time together make up the solving process of a neural differential equation, and if either is too long the total solving time suffers.
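To make the trade-off concrete, the following sketch times the two extremes and one intermediate block size in a JIT-compiling framework. It assumes JAX and its lax.scan primitive, whose unroll argument controls how many loop steps are expanded in the compiled program; the framework, the toy step function and the step counts are illustrative assumptions and not part of this application.

import time
import jax
import jax.numpy as jnp

def step(state, _):
    # one toy time step: state <- state + dt * f(state)
    return state + 0.01 * jnp.tanh(state), None

def time_solver(unroll):
    # unroll=1: loop kept as a loop; unroll=1000: fully expanded;
    # values in between correspond to block-wise (dynamic) expansion.
    @jax.jit
    def run(init_state):
        final_state, _ = jax.lax.scan(step, init_state, None,
                                      length=1000, unroll=unroll)
        return final_state
    x0 = jnp.ones((64,))
    t0 = time.perf_counter()
    run(x0).block_until_ready()          # first call: compile + run
    compile_and_run = time.perf_counter() - t0
    t1 = time.perf_counter()
    run(x0).block_until_ready()          # later calls: run only
    run_only = time.perf_counter() - t1
    return compile_and_run, run_only

for factor in (1, 10, 1000):             # no expansion / blocks / full expansion
    print(factor, time_solver(factor))

With unroll=1 the compiled program keeps the loop (short compile, many kernel launches); with unroll equal to the step count the loop is fully expanded (long compile, fast execution); intermediate values behave like the block-wise expansion discussed below.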
Disclosure of Invention
In view of the above problems, an object of the present application is to provide a data processing method, system, device and medium based on cyclic dynamic expansion, which optimizes the calculation process of a neural differential equation based on an improved cyclic dynamic expansion algorithm, so as to effectively improve the calculation speed.
In order to achieve the above purpose, the present application adopts the following technical scheme:
in a first aspect, the present application provides a data processing method based on cyclic dynamic expansion, including the steps of:
initializing a parameter value of a pre-established target neural network model, and constructing a target neural differential equation by taking the target neural network as a time step of a loop body;
solving the constructed target neural differential equation by adopting a cyclic dynamic expansion algorithm to obtain a trained target neural network;
and taking the current data to be processed as the input of the target neural network, and taking the output of the target neural network as the data processing result.
Further, solving the constructed target neural differential equation with the cyclic dynamic expansion algorithm, where the solving process is the training process of the target neural network, to obtain the trained target neural network comprises the following steps:
extracting the characteristics, compile time and run time of the target neural differential equation, and inputting them into a pre-trained autotuner model to obtain an optimal expansion factor;
and training and solving the target neural differential equation with the optimal expansion factor and the cyclic dynamic expansion algorithm to obtain a trained target neural network.
Further, the training of the autotuner model includes:
randomly generating and running different types of neural differential equations offline, and collecting the compile time and run time of each neural differential equation;
extracting the features of each neural differential equation, and combining the extracted features with the compile time and run time of each neural differential equation to obtain a training sample set;
and training the pre-established autotuner model offline with the training sample set to predict the optimal expansion factor.
Further, the different types of neural differential equations are randomly generated and run by varying their configurations, and the configuration of a neural differential equation includes at least: the number of hidden layers, the hidden layer width, the batch size, the differential equation dimension, the number of steps, and the expansion factor.
Further, when the feature extraction is performed on each neural differential equation, the extraction content includes:
the performance characteristics comprise the reading byte number, writing byte number, floating point operation times and calculation intensity in the calculation process of the neural network model;
the neural network architecture features include a parameter number, a hidden layer width, and a batch size;
equation characteristics including differential equation dimension, number of steps;
the expansion factor.
Further, the autotuner model is obtained by training a decision tree model;
the autotuner model comprises an input module, a compile-time estimation module, a run-time estimation module and a model output module;
the input module is used for acquiring the characteristics of the neural differential equation and sending the characteristics to the compile-time estimation module and the run-time estimation module;
the compile-time estimation module and the run-time estimation module are used for predicting the compile time and the run time according to the characteristics of the neural differential equation;
the model output module is used for obtaining the optimal expansion factor according to the compile time and the run time.
Further, the method for solving the constructed target neural differential equation by adopting the cyclic dynamic expansion algorithm to obtain a trained target neural network comprises the following steps:
2.2.1) Determine the iteration parameters, including the step function, the total number of steps, the initial state and the optimal expansion factor;
2.2.2) Divide all loop steps evenly into num_blocks blocks according to the total number of steps and the optimal expansion factor, and assign the initial state to the current state;
2.2.3) Set the iteration number i = 0 and compute the loop of the i-th block with the loop block expansion algorithm;
2.2.4) Judge whether the iteration number i is smaller than num_blocks; if so, set i = i + 1 and return to the previous step; otherwise output the current calculation result;
2.2.5) Obtain the trained target neural network through forward and backward propagation.
In a second aspect, the present application provides a data processing system based on cyclic dynamic expansion, comprising:
the neural differential equation construction module is used for initializing a parameter value of a pre-established target neural network model and constructing a target neural differential equation by taking the target neural network as a time step of the loop body;
the differential equation solving module is used for solving the constructed target neural differential equation by adopting a cyclic dynamic expansion algorithm, and the solving process is the training process of the target neural network to obtain a trained target neural network;
the network prediction module is used for taking the current data to be processed as the input of the target neural network and taking the output of the target neural network as the data processing result.
In a third aspect, the present application provides a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods.
In a fourth aspect, the present application provides a computing device comprising: one or more processors, memory, and one or more programs, wherein one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods.
Due to the adoption of the above technical solution, the application has the following advantages: the neural network training process is converted into a neural differential equation solving process, the optimal expansion factor is obtained through the designed autotuner model, and the loop of the target neural differential equation is expanded in a dynamic expansion mode, so that both the running-speed advantage of full loop expansion and the compiling-speed advantage of no expansion are obtained and the calculation speed can be effectively improved. The application can therefore be widely applied in the field of data processing.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Like parts are designated with like reference numerals throughout the drawings. In the drawings:
FIG. 1 is a conventional differential equation iterative loop solving process;
FIG. 2 is a flow chart of a data processing method based on cyclic dynamic expansion according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present application. It will be apparent that the described embodiments are some, but not all, embodiments of the application. All other embodiments, which are obtained by a person skilled in the art based on the described embodiments of the application, fall within the scope of protection of the application.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in fig. 1, in general differential equation solving, continuous time is discretized and the solver iterates step by step from the initial time to the end time. The iterative process typically uses a loop: each step performs a specific calculation, and the calculation of the next step depends on the result of the current step. In the neural differential equation scenario, the specific calculation performed by each step is typically a neural network.
In some embodiments of the present application, a data processing method based on cyclic dynamic expansion is provided: a target neural differential equation is constructed from a target neural network model established by the user, and the loop of the target neural differential equation is expanded in a dynamic expansion mode, so that the running-speed advantage of full loop expansion and the compiling-speed advantage of no expansion are both obtained and the calculation speed is effectively improved.
In accordance therewith, in other embodiments of the present application, a data processing system, apparatus, and medium based on cyclic dynamic unrolling are provided.
Example 1
As shown in fig. 2, the present embodiment provides a data processing method based on cyclic dynamic expansion, which includes the following steps:
1) Initializing a parameter value of a pre-established target neural network model, and constructing a target neural differential equation by taking the target neural network as a time step of a loop body;
2) Solving the constructed target neural differential equation by adopting a cyclic dynamic expansion algorithm, wherein the solving process is a training process of the target neural network, so as to obtain a trained target neural network;
3) And taking the current data to be processed as the input of the target neural network, and taking the output of the target neural network as the data processing result.
Preferably, in step 1), the target neural network model that is built includes, but is not limited to, a fully connected neural network.
Preferably, in step 2), solving the constructed target neural differential equation with the cyclic dynamic expansion algorithm comprises the following steps:
2.1) Extract the characteristics, compile time and run time of the target neural differential equation, and input them into a pre-trained autotuner model to obtain the optimal expansion factor;
2.2) Train and solve the target neural differential equation with the optimal expansion factor and the cyclic dynamic expansion algorithm to obtain a trained target neural network.
Preferably, in step 2.1), the method for training the autotuner model comprises the following steps:
2.1.1) Randomly generate and run different types of neural differential equations offline, and collect the compile time and run time of each neural differential equation;
2.1.2) Extract the features of each neural differential equation, and combine the extracted features with the corresponding compile time and run time to obtain a training sample set;
2.1.3) Train the pre-established autotuner model offline with the training sample set so that it can predict the optimal expansion factor.
Preferably, in step 2.1.1), the different types of neural differential equations are randomly generated and run by varying their configurations; specifically, the configuration of a neural differential equation includes at least: the number of hidden layers, the hidden layer width, the batch size, the differential equation dimension, the number of steps, the expansion factor, and so on.
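Purely as an illustration of such offline sample generation, the following Python sketch draws random configurations and records the measured times; every name in it (sample_config, measure, the candidate value lists) is hypothetical, and the actual measurement of compile and run time depends on the framework used.

import random

UNROLL_CANDIDATES = [1, 2, 4, 8, 16, 32, 64]

def sample_config():
    # one random neural differential equation configuration
    return {
        "hidden_layers": random.randint(1, 8),
        "hidden_width": random.choice([32, 64, 128, 256, 512]),
        "batch_size": random.choice([1, 16, 64, 256]),
        "ode_dim": random.choice([1, 2, 8, 32, 128]),
        "num_steps": random.choice([100, 1000, 10000]),
        "unroll": random.choice(UNROLL_CANDIDATES),
    }

def collect_samples(n, measure):
    # measure(config) -> (compile_time, run_time); implementation-specific
    samples = []
    for _ in range(n):
        cfg = sample_config()
        compile_time, run_time = measure(cfg)
        samples.append((cfg, compile_time, run_time))
    return samples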
Preferably, in step 2.1.2), when feature extraction is performed on each neural differential equation, the extracted content (assembled into a feature vector as sketched after this list) includes:
(1) performance features, mainly the number of bytes read, the number of bytes written and the number of floating point operations during the computation of the neural network model, together with the compute intensity (the number of floating point operations divided by the number of bytes read and written);
(2) neural network architecture features, mainly the parameter count, the hidden layer width and the batch size;
(3) equation features, mainly the differential equation dimension and the number of steps;
(4) the expansion factor.
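A minimal sketch of assembling these four groups into one feature vector, assuming the performance counters have already been measured (all parameter names are illustrative):

def build_features(read_bytes, write_bytes, flops,
                   param_count, hidden_width, batch_size,
                   ode_dim, num_steps, unroll):
    # compute intensity = floating point operations / bytes read and written
    compute_intensity = flops / max(read_bytes + write_bytes, 1)
    return [
        read_bytes, write_bytes, flops, compute_intensity,  # (1) performance
        param_count, hidden_width, batch_size,              # (2) architecture
        ode_dim, num_steps,                                  # (3) equation
        unroll,                                              # (4) expansion factor
    ]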
Preferably, in step 2.1.3), the autotuner model is obtained by training a decision tree model. Specifically, the autotuner model comprises an input module, a compile-time estimation module, a run-time estimation module and a model output module. The input module acquires the features of the neural differential equation and sends them to the compile-time estimation module and the run-time estimation module; the compile-time estimation module and the run-time estimation module predict the compile time and the run time from the features of the neural differential equation; and the model output module obtains the optimal expansion factor from the predicted compile time and run time.
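As an illustrative sketch only, the four modules could be realized with two decision-tree regressors, for example from scikit-learn; the library choice and class layout are assumptions, not prescribed by this application.

from sklearn.tree import DecisionTreeRegressor

class AutoTuner:
    def __init__(self):
        self.compile_model = DecisionTreeRegressor()  # compile-time estimation module
        self.run_model = DecisionTreeRegressor()      # run-time estimation module

    def fit(self, features, compile_times, run_times):
        # offline training on the collected sample set
        self.compile_model.fit(features, compile_times)
        self.run_model.fit(features, run_times)

    def best_unroll(self, make_features, candidates):
        # model output module: pick the factor with the lowest predicted total time
        best, best_total = None, float("inf")
        for unroll in candidates:
            x = [make_features(unroll)]   # input module: features for this factor
            total = (self.compile_model.predict(x)[0]
                     + self.run_model.predict(x)[0])
            if total < best_total:
                best, best_total = unroll, total
        return best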
Further, in step 2.2), dynamic expansion lies between full expansion and non-expansion. The core idea of the dynamic expansion algorithm is to divide the loop steps into several blocks, the number of loop steps in each block being the expansion factor; the loop steps within a block are fully expanded, while the loop across blocks is not expanded. Dynamic expansion combines the advantages of full expansion and non-expansion, effectively controls both the compile time and the run time, and reduces the total time.
Specifically, based on the obtained training sample data, the target neural differential equation is trained and solved with the optimal expansion factor and the cyclic dynamic expansion algorithm to obtain a trained target neural network, which comprises the following steps:
2.2.1) Determine the iteration parameters, including the step function step_fn, the total number of steps num_steps, the initial state init_state and the optimal expansion factor unroll;
2.2.2) Divide all loop steps evenly into num_blocks blocks according to the total number of steps and the optimal expansion factor, and assign the initial state init_state to the current state;
2.2.3) Set the iteration number i = 0 and compute the loop of the i-th block with the loop block expansion algorithm;
2.2.4) Judge whether the iteration number i is smaller than num_blocks; if so, set i = i + 1 and return to the previous step; otherwise output the current calculation result;
2.2.5) Obtain the trained target neural network through forward and backward propagation.
Example 2
In this embodiment, taking a differential equation common to physics as an example, the data processing method based on cyclic dynamic expansion provided by the application is further described.
Specifically, the target neural network constructed in this embodiment is $f_{\theta}$, a network with parameters $\theta$ that is used to predict the physical state $y$ at a given time $t$. Based on the target neural network, the following neural differential equation can be obtained:

$\dfrac{dy}{dt} = f_{\theta}(t, y)$    (1)

After continuous time is discretized, the iteration can be carried out with a loop; the first step of the loop is calculated with the formula below, and the other loop steps are analogous, so the solving process of the neural differential equation is the process of training the neural network:

$y_{t_1} = y_{t_0} + \Delta t \cdot f_{\theta}(t_0, y_{t_0})$    (2)

$y_{t_{i+1}} = y_{t_i} + \Delta t \cdot f_{\theta}(t_i, y_{t_i})$    (3)

where $\Delta t$ is the length of a single time step and $y_{t_i}$ is the state value at time $t_i$.
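A minimal sketch of this discretized iteration, assuming JAX and the explicit update of equations (2)-(3); the network f_theta, its parameters and the observation data are placeholders supplied by the user.

import jax
import jax.numpy as jnp

def solve_ode(f_theta, params, y0, t0, dt, num_steps):
    # explicit discretization: each loop step applies equation (3)
    def step(carry, _):
        t, y = carry
        y_next = y + dt * f_theta(params, t, y)
        return (t + dt, y_next), y_next
    (_, y_final), trajectory = jax.lax.scan(step, (t0, y0), None,
                                            length=num_steps)
    return y_final, trajectory

def loss(params, f_theta, y0, t0, dt, num_steps, observed):
    # training signal: distance between the solved trajectory and observations
    _, trajectory = solve_ode(f_theta, params, y0, t0, dt, num_steps)
    return jnp.mean((trajectory - observed) ** 2)

# gradients for training are obtained by back-propagating through the loop, e.g.
# grads = jax.grad(lambda p: loss(p, f_theta, y0, 0.0, dt, num_steps, observed))(params)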
A high-dimensional partial differential equation can also be converted into a forward-backward neural stochastic differential equation (Forward Backward Neural Stochastic Differential Equation, FBNSDE) to solve the Hamilton-Jacobi-Bellman equation, which is widely applied in fields such as robot control.
In that formulation, the state at a given time, a neural network and the derivative of the neural network appear in the update; the solving process of the forward-backward neural stochastic differential equation likewise discretizes time and forms a loop.
Example 3
This embodiment compares the present application with conventional block expansion algorithms and non-expansion algorithms.
Table 1 compares the speed of the different expansion algorithms, normalized to the case in which the loop is not expanded. The dynamic expansion algorithm using the optimal expansion factor offers a significant performance advantage over both non-expansion and full expansion.
TABLE 1 comparison of different expansion Algorithm Performance of neural ordinary differential equation
The GPU-based block expansion algorithm, the non-expansion algorithm, and the dynamic expansion algorithm adopted in this embodiment are described below, respectively.
1) The loop block expansion algorithm running on the GPU, named block_unroll, comprises the following steps:
1. given a step function step_fn, a block step number block_length and an initial state init_state;
2. assign the initial state init_state to the current state state;
3. for i in range(block_length):
4.     output, state = step_fn(i, state);
5. return output, state;
2) The loop non-expansion algorithm running on the GPU, named non_unroll, comprises the following steps:
1. given a step function step_fn, a condition function cond_fn and an initial state init_state;
2. assign the initial state init_state to the current state state;
3. do:
4.     execute cond_fn(state) on the GPU and assign the result to the GPU variable cond_result_device;
5.     copy the GPU variable cond_result_device to the host variable cond_result_host and wait until the copy has finished;
6.     if cond_result_host == false:
7.         break;
8.     execute step_fn on the GPU: state = step_fn(state);
9. while true;
10. return state;
3) The dynamic expansion algorithm, named dynamic_unroll, comprises the following steps:
1. given a step function step_fn, a total step number num_steps, an initial state init_state and an expansion factor unroll;
2. assign the initial state init_state to the current state state;
3. initialize i to 0;
4. remainder = num_steps % unroll;
5. num_blocks = (num_steps - remainder) / unroll;
6. do:
7.     output, state = block_unroll(step_fn, unroll, state);
8.     i += 1;
9. while i < num_blocks;
10. output, state = block_unroll(step_fn, remainder, state);
11. return output;
the 6-9 behavior non_unrell algorithm in 3), i.e., loop non-expansion algorithm, is described.
To make this convenient for users, this embodiment designs an autotuner: an autotuner model is trained on collected running data of common neural differential equations, and a search algorithm can then predict a reasonable expansion factor for the neural differential equation defined by the user. Specifically, given two trained autotuner machine learning models, one predicting the compile time and one predicting the run time, the optimal expansion factor is searched for with a search algorithm as follows (a runnable sketch of the search loop follows the listing).
1. given the solution space of expansion factors and the trained machine learning models for compile time and run time;
2. set the search algorithm to a simulated annealing algorithm, set the maximum number of iterations max_iter, set the current iteration number i = 0, and initialize the expansion factor unroll;
3. do:
4.     select a candidate expansion factor from the solution space with the search algorithm;
5.     predict the compile time of the candidate with the compile-time model;
6.     predict the run time of the candidate with the run-time model;
7.     if the predicted total time is lower than the best total time found so far, record the candidate as the current best expansion factor;
8.     i += 1;
9. while i < max_iter;
10. return the best expansion factor;
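A minimal sketch of this search loop; the candidate set, the annealing schedule and the acceptance rule are illustrative assumptions, and predict_compile / predict_run stand for the two trained estimators.

import math
import random

def search_unroll(candidates, predict_compile, predict_run, make_features,
                  max_iter=100, temperature=1.0, cooling=0.95):
    # total predicted time = predicted compile time + predicted run time
    def cost(unroll):
        x = make_features(unroll)
        return predict_compile(x) + predict_run(x)

    current = random.choice(candidates)
    current_cost = cost(current)
    best, best_cost = current, current_cost
    for _ in range(max_iter):
        candidate = random.choice(candidates)        # neighbour proposal
        candidate_cost = cost(candidate)
        # simulated-annealing acceptance: always accept improvements,
        # occasionally accept worse candidates while the temperature is high
        if candidate_cost < current_cost or \
           random.random() < math.exp((current_cost - candidate_cost) / temperature):
            current, current_cost = candidate, candidate_cost
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
        temperature = max(temperature * cooling, 1e-9)
    return best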
The total training time on a given neural differential equation when the optimal expansion factor is used is compared with the total training time when the expansion factor predicted by the autotuner is used; the end-to-end prediction accuracy of the autotuner is measured by the deviation between these two times. Tests were performed on Titan RTX and Tesla V100, respectively. The results in Table 2 show that the predictions of the autotuner of the present application deviate by less than 8% in the end-to-end task.
TABLE 2 End-to-end deviation of the expansion factor autotuner on the neural ordinary differential equation
Example 4
Corresponding to the data processing method based on cyclic dynamic expansion provided in embodiment 1 above, this embodiment provides a data processing system based on cyclic dynamic expansion. The system provided in this embodiment may implement the data processing method based on cyclic dynamic expansion of embodiment 1, and the system may be implemented by software, hardware, or a combination of software and hardware. For example, the system may include integrated or separate functional modules or functional units to perform the corresponding steps in the method of embodiment 1. Since the system of this embodiment is substantially similar to the method embodiment, the description of this embodiment is relatively brief, and for the relevant points reference may be made to the description of embodiment 1, which is provided by way of illustration only.
The data processing system based on cyclic dynamic expansion provided in this embodiment includes:
the neural differential equation construction module is used for initializing a parameter value of a pre-established target neural network model and constructing a target neural differential equation by taking the target neural network as a time step of the loop body;
the differential equation solving module is used for solving the constructed target neural differential equation by adopting a cyclic dynamic expansion algorithm, and the solving process is the training process of the target neural network to obtain a trained target neural network;
the network prediction module is used for taking the current data to be processed as the input of the target neural network and taking the output of the target neural network as the data processing result.
Example 5
This embodiment provides a processing device corresponding to the data processing method based on cyclic dynamic expansion provided in embodiment 1. The processing device may be a client-side processing device, for example a mobile phone, notebook computer, tablet computer or desktop computer, that executes the method of embodiment 1.
The processing device comprises a processor, a memory, a communication interface and a bus, wherein the processor, the memory and the communication interface are connected through the bus so as to complete communication among each other. A computer program executable on the processor is stored in the memory, and when the processor executes the computer program, the data processing method based on cyclic dynamic expansion provided in this embodiment 1 is executed.
In some embodiments, the memory may be a high-speed Random Access Memory (RAM), and may also include non-volatile memory, such as at least one disk memory.
In other embodiments, the processor may be a Central Processing Unit (CPU), a Digital Signal Processor (DSP), or other general purpose processor, which is not limited herein.
Example 6
The data processing method based on cyclic dynamic expansion of embodiment 1 may be embodied as a computer program product, which may include a computer readable storage medium carrying computer readable program instructions for executing the data processing method based on cyclic dynamic expansion of embodiment 1.
The computer readable storage medium may be a tangible device that retains and stores instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any combination of the preceding.
Finally, it should be noted that: the above embodiments are only for illustrating the technical aspects of the present application and not for limiting the same, and although the present application has been described in detail with reference to the above embodiments, it should be understood by those of ordinary skill in the art that: modifications and equivalents may be made to the specific embodiments of the application without departing from the spirit and scope of the application, which is intended to be covered by the claims.

Claims (10)

1. The data processing method based on the cyclic dynamic expansion is characterized by comprising the following steps of:
initializing a parameter value of a pre-established target neural network model, and constructing a target neural differential equation by taking the target neural network as a time step of a loop body;
solving the constructed target neural differential equation by adopting a cyclic dynamic expansion algorithm to obtain a trained target neural network;
and taking the current data to be processed as the input of the target neural network, and taking the output of the target neural network as the data processing result.
2. The method for processing data based on cyclic dynamic expansion according to claim 1, wherein the step of solving the constructed target neural differential equation by using the cyclic dynamic expansion algorithm, namely, the training process of the target neural network, to obtain the trained target neural network comprises the steps of:
extracting the characteristics, compile time and run time of the target neural differential equation, and inputting them into a pre-trained autotuner model to obtain an optimal expansion factor;
and training and solving the target neural differential equation with the optimal expansion factor and the cyclic dynamic expansion algorithm to obtain a trained target neural network.
3. A data processing method based on cyclic dynamic expansion as claimed in claim 2, characterized in that the training of the autotuner model comprises:
randomly generating and running different types of neural differential equations offline, and collecting the compile time and run time of each neural differential equation;
extracting the features of each neural differential equation, and combining the extracted features with the compile time and run time of each neural differential equation to obtain a training sample set;
and training the pre-established autotuner model offline with the training sample set to predict the optimal expansion factor.
4. A data processing method based on cyclic dynamic expansion as claimed in claim 3, characterized in that said randomly generating and running different types of neural differential equations is performed by changing different configurations of the neural differential equations, said configurations of the neural differential equations at least comprising: number of hidden layers, width of hidden layers, batch size, differential equation dimension, number of steps, and expansion factor.
5. A method for processing data based on cyclic dynamic expansion as claimed in claim 3, wherein, when extracting features of each neural differential equation, the extracting contents include:
the performance characteristics comprise the reading byte number, writing byte number, floating point operation times and calculation intensity in the calculation process of the neural network model;
the neural network architecture features include a parameter number, a hidden layer width, and a batch size;
equation characteristics including differential equation dimension, number of steps;
the expansion factor.
6. A data processing method based on cyclic dynamic expansion as claimed in claim 3, characterized in that the autotuner model is obtained by training a decision tree model;
the autotuner model comprises an input module, a compile-time estimation module, a run-time estimation module and a model output module;
the input module is used for acquiring the characteristics of the neural differential equation and sending the characteristics to the compile-time estimation module and the run-time estimation module;
the compile-time estimation module and the run-time estimation module are used for predicting the compile time and the run time according to the characteristics of the neural differential equation;
the model output module is used for obtaining the optimal expansion factor according to the compile time and the run time.
7. The method for processing data based on cyclic dynamic expansion according to claim 1, wherein the step of solving the constructed target neural differential equation by using the cyclic dynamic expansion algorithm to obtain a trained target neural network comprises the steps of:
2.2.1) Determine the iteration parameters, including the step function, the total number of steps, the initial state and the optimal expansion factor;
2.2.2) Divide all loop steps evenly into num_blocks blocks according to the total number of steps and the optimal expansion factor, and assign the initial state to the current state;
2.2.3) Set the iteration number i = 0 and compute the loop of the i-th block with the loop block expansion algorithm;
2.2.4) Judge whether the iteration number i is smaller than num_blocks; if so, set i = i + 1 and return to the previous step; otherwise output the current calculation result;
2.2.5) Obtain the trained target neural network through forward and backward propagation.
8. A data processing system based on cyclic dynamic expansion, comprising:
the neural differential equation construction module is used for initializing a parameter value of a pre-established target neural network model and constructing a target neural differential equation by taking the target neural network as a time step of the loop body;
the differential equation solving module is used for solving the constructed target neural differential equation by adopting a cyclic dynamic expansion algorithm, and the solving process is the training process of the target neural network to obtain a trained target neural network;
the network prediction module is used for taking the current data to be processed as the input of the target neural network and taking the output of the target neural network as the data processing result.
9. A computer readable storage medium storing one or more programs, wherein the one or more programs comprise instructions, which when executed by a computing device, cause the computing device to perform any of the methods of claims 1-7.
10. A computing device, comprising: one or more processors, memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1-7.
CN202311250053.4A 2023-09-26 2023-09-26 Data processing method, system, equipment and medium based on cyclic dynamic expansion Active CN116992196B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311250053.4A CN116992196B (en) 2023-09-26 2023-09-26 Data processing method, system, equipment and medium based on cyclic dynamic expansion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311250053.4A CN116992196B (en) 2023-09-26 2023-09-26 Data processing method, system, equipment and medium based on cyclic dynamic expansion

Publications (2)

Publication Number Publication Date
CN116992196A true CN116992196A (en) 2023-11-03
CN116992196B CN116992196B (en) 2023-12-12

Family

ID=88523515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311250053.4A Active CN116992196B (en) 2023-09-26 2023-09-26 Data processing method, system, equipment and medium based on cyclic dynamic expansion

Country Status (1)

Country Link
CN (1) CN116992196B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109993271A (en) * 2019-04-16 2019-07-09 浙江锐文科技有限公司 Grey neural network forecasting based on theory of games
US20220172812A1 (en) * 2020-09-08 2022-06-02 Genentech, Inc. PK/PD Prediction Using an Ode-Based Neural Network System
WO2022192291A1 (en) * 2021-03-08 2022-09-15 The Johns Hopkins University Evolutional deep neural networks
CN115082307A (en) * 2022-05-14 2022-09-20 西北工业大学深圳研究院 Image super-resolution method based on fractional order differential equation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109993271A (en) * 2019-04-16 2019-07-09 浙江锐文科技有限公司 Grey neural network forecasting based on theory of games
US20220172812A1 (en) * 2020-09-08 2022-06-02 Genentech, Inc. PK/PD Prediction Using an Ode-Based Neural Network System
WO2022192291A1 (en) * 2021-03-08 2022-09-15 The Johns Hopkins University Evolutional deep neural networks
CN115082307A (en) * 2022-05-14 2022-09-20 西北工业大学深圳研究院 Image super-resolution method based on fractional order differential equation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RICKY T. Q. CHEN et al.: "Neural Ordinary Differential Equations", arXiv *

Also Published As

Publication number Publication date
CN116992196B (en) 2023-12-12

Similar Documents

Publication Publication Date Title
CN111258767B (en) Cloud computing resource intelligent distribution method and device for complex system simulation application
JP7430744B2 (en) Improving machine learning models to improve locality
CN116189670A (en) Always-on keyword detector
Dong et al. Characterizing the microarchitectural implications of a convolutional neural network (cnn) execution on gpus
Yu et al. Instruction driven cross-layer cnn accelerator for fast detection on fpga
CN114358267A (en) Method for reducing GPU memory occupation in deep neural network training process
Zunin Intel openvino toolkit for computer vision: Object detection and semantic segmentation
Vo et al. A deep learning accelerator based on a streaming architecture for binary neural networks
CN116992196B (en) Data processing method, system, equipment and medium based on cyclic dynamic expansion
Zhu et al. A performance prediction framework for irregular applications
CN112990461B (en) Method, device, computer equipment and storage medium for constructing neural network model
Ozcan et al. Investigation of the performance of LU decomposition method using CUDA
US10990525B2 (en) Caching data in artificial neural network computations
HAMMER Exploiting the space filling curve ordering of particles in the neighbour search of Gadget3
Perkins Cltorch: a hardware-agnostic backend for the torch deep neural network library, based on opencl
Li et al. An experimental study on deep learning based on different hardware configurations
CN112633516B (en) Performance prediction and machine learning compiling optimization method and device
Li et al. An application-oblivious memory scheduling system for DNN accelerators
Stoica et al. High performance cuda based cnn image processor
Chraibi et al. Run time optimization using a novel implementation of parallel-PSO for real-world applications
Wang et al. Keyword spotting system and evaluation of pruning and quantization methods on low-power edge microcontrollers
CN111832815A (en) Scientific research hotspot prediction method and system
CN110751288A (en) Model training method and device, computer equipment and storage medium
CN110428872B (en) Method and device for converting gene comparison instruction set
EP3895024A1 (en) Caching data in artificial neural network computations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant