CN113626038A

CN113626038A - Code conversion method, device, equipment and storage medium

Info

Publication number: CN113626038A
Application number: CN202110763532.0A
Authority: CN
Inventors: 肖熠; 霍志坤; 李志功
Original assignee: Dawning Information Industry Beijing Co Ltd
Current assignee: Dawning Information Industry Beijing Co Ltd
Priority date: 2021-07-06
Filing date: 2021-07-06
Publication date: 2021-11-09

Abstract

The application discloses a code conversion method, a device, equipment and a storage medium, wherein the method comprises the following steps: transcoding a kernel function of a CUDA Fortran language in codes to be converted to obtain a header file and a source file comprising the kernel function of a CUDA C language, wherein a main code in the codes to be converted comprises a first code and a second code, the second code comprises a calling code, constructing an encapsulation function according to the source file, the header file and the kernel function of the CUDA C language, the encapsulation function is used for calling the kernel function of the CUDA C language, modifying the second code to obtain a third code compatible with HIP, the third code comprises a code used for calling the encapsulation function, and finally obtaining the converted codes according to the modified main code, the source file, the header file and the encapsulation function. Therefore, code conversion is realized, and code conversion efficiency is improved.

Description

Code conversion method, device, equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for transcoding.

Background

In modern high-performance computing, architectures that use heterogeneous accelerators as the main computing unit have gradually become mainstream. A heterogeneous accelerator can provide high floating-point computing performance at low power consumption, and a Graphics Processing Unit (GPU) is one kind of heterogeneous accelerator. To make use of heterogeneous accelerators that differ in structure, the industry has proposed a migratable heterogeneous parallel programming model (HIP) that can run on CUDA and on other heterogeneous accelerators. The model can control a heterogeneous accelerator explicitly or implicitly, for example by running computing tasks or copying data from the host side, and HIP provides only C/C++ syntax and interface support. CUDA is a heterogeneous parallel programming model that allows NVIDIA GPUs to be used for general-purpose computation.

Fortran (Formula Translation) is a main language in the field of scientific computing, and scientific computing is a main field of application of high-performance computing technology. In some fields of scientific computing, such as atmospheric science and geology, the earliest scientific computing programs were usually developed in Fortran. For such programs to run on NVIDIA GPUs, they must be rewritten in Fortran according to the writing requirements of CUDA, yielding heterogeneous parallel programs (CUDA Fortran code for short). Because HIP provides only C/C++ syntax and interface support, running CUDA Fortran code on other heterogeneous accelerators requires porting it to HIP, that is, converting the CUDA Fortran code into a heterogeneous parallel program written in the C language according to the writing requirements of HIP (HIP C code for short).

At present, CUDA Fortran code is converted into HIP C code manually, which places high demands on personnel and takes considerable time, so code conversion efficiency is low.

Disclosure of Invention

The application provides a code conversion method, a device, equipment and a storage medium, which are used for solving the problem of low code conversion efficiency when converting a CUDA Fortran code into a HIP C code.

In a first aspect, the present application provides a transcoding method, including:

transcoding a kernel function of a CUDA Fortran language in codes to be converted to obtain a header file and a source file comprising the kernel function of a CUDA C language, wherein the codes to be converted comprise main codes and the kernel function of the CUDA Fortran language, the main codes comprise first codes and second codes, and the second codes comprise calling codes for calling the kernel function of the CUDA Fortran language;

constructing an encapsulation function according to the source file and the header file, wherein the encapsulation function is used for calling the kernel function of the CUDA C language;

modifying the second code to obtain a third code which is compatible with a migratable heterogeneous parallel programming model (HIP), wherein the third code comprises a code for calling the encapsulation function;

and obtaining converted codes according to the modified main codes, the source file, the header file and the encapsulation function, wherein the modified main codes comprise the first codes and the third codes.
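As an illustration of the division of the code to be converted into kernel functions and main code, the following is a minimal sketch in Python. It assumes that kernel functions are written as `attributes(global) subroutine` blocks, which is the usual CUDA Fortran form; the function name `split_code` and the sample program are hypothetical and are not taken from the application.

```python
import re

# Matches a CUDA Fortran kernel from "attributes(global) subroutine name"
# up to the next "end subroutine" line. Unusual formatting is not handled.
KERNEL_RE = re.compile(
    r"^[ \t]*attributes\s*\(\s*global\s*\)\s*subroutine\s+(\w+).*?"
    r"^[ \t]*end\s+subroutine\b[^\n]*\n?",
    re.IGNORECASE | re.DOTALL | re.MULTILINE,
)


def split_code(source: str):
    """Return (kernels, main_code); kernels maps kernel name -> kernel text."""
    kernels = {m.group(1): m.group(0) for m in KERNEL_RE.finditer(source)}
    main_code = KERNEL_RE.sub("", source)
    return kernels, main_code


# Hypothetical code to be converted.
EXAMPLE = """\
module kernels
contains
  attributes(global) subroutine saxpy(n, a, x, y)
    integer, value :: n
    real, value :: a
    real :: x(n), y(n)
  end subroutine saxpy
end module kernels
program main
  use cudafor
  call saxpy<<<grid, block>>>(n, a, x_d, y_d)
end program main
"""

kernels, main_code = split_code(EXAMPLE)
```

In this sketch the kernel text would be handed to the transcoding step, while the remaining main code still contains the calling code that is later modified.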

Optionally, the second code further includes calling code of a runtime system interface and a reference module, and the modifying the second code to obtain a third code compatible with the migratable heterogeneous parallel programming model HIP includes:

modifying the calling code of the runtime system interface into the calling code of the HIP compatible runtime system interface;

modifying the referencing module to the HIP referencing module;

modifying the calling code into the code for calling the encapsulation function;

and obtaining the third code according to the calling code of the HIP compatible runtime system interface, the HIP reference module and the code for calling the encapsulation function.

Another embodiment in the above application has the following advantages or benefits: the calling code of the runtime system interface, the reference module and the calling code are each modified, so that a third code compatible with the HIP is obtained.

Optionally, the encapsulation function includes a kernel function encapsulation module in C language and a kernel function calling module in Fortran language, where the kernel function encapsulation module in C language is used to perform a calling process on the kernel function in CUDA C language and encapsulate the calling process, and the kernel function calling module in Fortran language is used to construct a Fortran interface and call a function provided by the kernel function encapsulation module in C language.

Another embodiment in the above application has the following advantages or benefits: the kernel function of the CUDA C language is called indirectly by the code for calling the encapsulation function through the kernel function calling module of the Fortran language and the kernel function encapsulation module of the C language when the converted code is operated, so that the calling of the kernel function of the CUDA C language is realized, and the code conversion of the calling process of the kernel function is realized.

Optionally, the constructing an encapsulation function according to the source file and the header file includes:

constructing the kernel function encapsulation module of the C language according to the source file and the header file;

and constructing the kernel function calling module of the Fortran language according to the kernel function encapsulation module of the C language.

Another embodiment in the above application has the following advantages or benefits: because the kernel function calling module of the Fortran language constructs a Fortran interface and calls a function provided by the kernel function encapsulation module of the C language, the kernel function encapsulation module of the C language is constructed first according to the source file and the header file, and the kernel function calling module of the Fortran language is then constructed according to it, so that the calling module can conveniently call the function provided by the encapsulation module.

Optionally, the constructing the kernel function encapsulation module of the C language according to the source file and the header file includes:

extracting kernel function parameters from the kernel function of the CUDA C language according to the kernel function declaration in the header file;

according to the structure of the kernel function parameters, constructing a parameter declaration of the kernel function encapsulation module of the C language, wherein the parameter declaration of the kernel function encapsulation module of the C language comprises kernel function operation parameters and the kernel function parameters;

constructing a second calling code according to the parameter declaration of the kernel function encapsulation module of the C language, wherein the parameters of the second calling code are the same as the parameters included in the parameter declaration of the kernel function encapsulation module of the C language;

copying the code of the source file into the header file to obtain a target header file;

and obtaining the kernel function encapsulation module of the C language according to a preset header file corresponding to the HIP, the target header file, the parameter declaration of the kernel function encapsulation module of the C language and the second calling code.

Another embodiment in the above application has the following advantages or benefits: the parameter declaration and the second calling code of the kernel function encapsulation module of the C language are first constructed in sequence according to the structure of the kernel function parameters, the code of the source file is then copied into the header file to obtain the target header file, and the kernel function encapsulation module of the C language is obtained according to the preset header file corresponding to the HIP, the target header file, the parameter declaration and the second calling code. The kernel function encapsulation module of the C language is thereby constructed and can conveniently call the kernel function of the CUDA C language.
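The construction steps above can be sketched as a small text generator in Python. This is only an illustration under stated assumptions: the kernel declaration is assumed to fit the simple pattern `__global__ void name(type param, ...);`, the kernel run parameters are reduced to two hypothetical integers `grid_x` and `block_x`, the wrapper name `launch_<kernel>` and the target header name `<kernel>_target.h` are invented for the example, and `hip/hip_runtime.h` with the `hipLaunchKernelGGL` macro is assumed to play the role of the preset header file corresponding to the HIP.

```python
import re


def parse_kernel_declaration(decl: str):
    """Extract (name, [(c_type, param_name), ...]) from a kernel declaration."""
    m = re.search(r"__global__\s+void\s+(\w+)\s*\((.*)\)\s*;", decl, re.DOTALL)
    if m is None:
        raise ValueError("no kernel declaration found")
    params = []
    for item in m.group(2).split(","):
        item = item.strip()
        if not item:
            continue
        # The last identifier is the parameter name; the rest is its type.
        pm = re.match(r"(.+?)(\w+)$", item)
        params.append((pm.group(1).strip(), pm.group(2)))
    return m.group(1), params


def build_c_wrapper(decl: str, hip_header: str = "hip/hip_runtime.h") -> str:
    """Generate the text of a C-callable wrapper that launches the kernel."""
    name, params = parse_kernel_declaration(decl)
    # Parameter declaration = kernel run parameters + kernel function parameters.
    run_params = ["int grid_x", "int block_x"]  # hypothetical run parameters
    kernel_params = [
        f"{ctype}{'' if ctype.endswith('*') else ' '}{pname}"
        for ctype, pname in params
    ]
    signature = ", ".join(run_params + kernel_params)
    call_args = ", ".join(pname for _, pname in params)
    return (
        f"#include <{hip_header}>\n"
        f'#include "{name}_target.h"  /* header with the kernel source copied in */\n'
        f'extern "C" void launch_{name}({signature})\n'
        "{\n"
        # Second calling code: uses the same parameters as the declaration above.
        f"    hipLaunchKernelGGL({name}, dim3(grid_x), dim3(block_x), 0, 0, {call_args});\n"
        "}\n"
    )


wrapper_text = build_c_wrapper(
    "__global__ void saxpy(int n, float a, float *x, float *y);"
)
```

The `extern "C"` linkage is one way to give the wrapper an unmangled symbol name that a Fortran interface could bind to; the application does not state how the symbol is exposed.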

Optionally, the constructing a kernel function calling module of the Fortran language according to the kernel function encapsulation module of the C language includes:

constructing a first calling code according to the parameter declaration of the kernel function encapsulation module of the C language, wherein the parameters of the first calling code are the same as the parameters of the kernel function encapsulation module of the C language excluding the matrix dimension description information parameter;

constructing a calling function statement according to the parameters of the first calling code;

and obtaining the kernel function calling module of the Fortran language according to the calling function statement, a preset reference module and the first calling code.

Another embodiment in the above application has the following advantages or benefits: the first calling code and the calling function declaration are constructed in sequence according to the parameter declaration of the kernel function encapsulation module of the C language, and the kernel function calling module of the Fortran language is obtained according to the calling function declaration, a preset reference module and the first calling code. The kernel function calling module of the Fortran language is thereby constructed and can conveniently call the function provided by the kernel function encapsulation module of the C language.
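A possible shape of the Fortran calling module is sketched below as a Python text generator. The sketch assumes that the preset reference module is `iso_c_binding`, that the calling function declaration is a `bind(C)` interface, that device buffers are passed as `type(c_ptr)` values, and that the wrapper is named `launch_<kernel>`; none of these choices is stated in the application. It covers only the case in which no matrix dimension description information parameter is present.

```python
def fortran_decl(ctype: str, pname: str) -> str:
    """Map a C parameter type to an interoperable Fortran declaration."""
    scalars = {
        "int": "integer(c_int)",
        "float": "real(c_float)",
        "double": "real(c_double)",
    }
    # Pointers (device buffers) are passed by value as opaque C pointers.
    ftype = "type(c_ptr)" if ctype.endswith("*") else scalars[ctype]
    return f"      {ftype}, value :: {pname}"


def build_fortran_module(name: str, params) -> str:
    """params: [(c_type, param_name), ...] taken from the C wrapper declaration."""
    args = ", ".join(p for _, p in params)
    lines = [
        f"module {name}_caller",
        "  use iso_c_binding",
        "  interface",
        f'    subroutine launch_{name}({args}) bind(C, name="launch_{name}")',
        "      use iso_c_binding",
        *[fortran_decl(t, p) for t, p in params],
        f"    end subroutine launch_{name}",
        "  end interface",
        f"end module {name}_caller",
    ]
    return "\n".join(lines) + "\n"


def build_call(name: str, params) -> str:
    """First calling code: a Fortran call with the same parameters."""
    return f"call launch_{name}({', '.join(p for _, p in params)})"


# Hypothetical parameter declaration of the C wrapper.
params = [("int", "grid_x"), ("int", "block_x"), ("int", "n"), ("float *", "x")]
module_text = build_fortran_module("saxpy", params)
call_text = build_call("saxpy", params)
```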

Optionally, the kernel function calling module of the Fortran language further includes a matrix dimension information description array, and the method further includes:

when the kernel function parameters are determined to comprise matrix dimension description information parameters and the format of the matrix dimension description information parameters is a preset format, constructing a matrix dimension information description array according to the matrix dimension description information parameters;

and adding the matrix dimension information description array into a kernel function calling module of the Fortran language.

Another embodiment in the above application has the following advantages or benefits: when it is determined that the kernel function parameters include matrix dimension description information parameters in the preset format, a matrix dimension information description array is constructed according to those parameters and added to the kernel function calling module of the Fortran language, so that the kernel function calling module of the Fortran language can perform correct address offset calculation and complete the value-taking operation by means of the matrix dimension information description array.

Optionally, the constructing a matrix dimension information description array according to the matrix dimension description information parameter includes:

acquiring matrix dimension description information during kernel function parameter transmission of the CUDA C language according to the matrix dimension description information parameters, wherein the matrix dimension description information comprises matrix dimensions and each dimension interval of the matrix;

constructing a matrix dimension information description array according to the matrix dimension and each dimension interval of the matrix, wherein the dimensions of the matrix dimension information description array comprise a first dimension and a second dimension, the first dimension is the same as the matrix dimension, and the second dimension is used for representing information of each dimension of the matrix, wherein the information of one dimension comprises: dimension interval starting value, dimension interval size, dimension interval ending value and dimension relative matrix starting address total offset value.

Another embodiment in the above application has the following advantages or benefits: the matrix dimension description information used when the kernel function parameters of the CUDA C language are passed is obtained first, and the matrix dimension information description array is constructed according to the matrix dimension and each dimension interval of the matrix included in that information. The matrix dimension information description array is thereby constructed, so that the kernel function calling module of the Fortran language can conveniently perform correct address offset calculation and complete the value-taking operation by means of it.
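To make the layout of the matrix dimension information description array concrete, here is a minimal Python sketch. The application lists four items per dimension (interval starting value, interval size, interval ending value, and total offset value relative to the matrix starting address) but does not give a formula for the last one; the sketch assumes it is the element stride of that dimension in Fortran column-major order. The function names are hypothetical.

```python
def build_dim_info(intervals):
    """Build one row per matrix dimension: [start, size, end, offset].

    intervals: list of (start, end) index bounds, one pair per dimension.
    The offset column is ASSUMED here to be the column-major element stride.
    """
    rows = []
    stride = 1
    for start, end in intervals:
        size = end - start + 1
        rows.append([start, size, end, stride])
        stride *= size
    return rows


def linear_offset(dim_info, index):
    """Element offset of a multi-dimensional index from the matrix start."""
    return sum(
        (i - start) * stride
        for i, (start, _size, _end, stride) in zip(index, dim_info)
    )


# A 2-D matrix declared with bounds (1:4, 0:2), as Fortran allows.
info = build_dim_info([(1, 4), (0, 2)])
first = linear_offset(info, (1, 0))
last = linear_offset(info, (4, 2))
```

With these bounds the first element sits at offset 0 and the last at offset 11, which is consistent with a 4 by 3 matrix stored contiguously.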

Optionally, the converted code is compiled as follows:

compiling the packaging function, the source file and the header file by using a hipcc compiler to generate a dynamic link library file;

and compiling the modified main code by calling a Fortran compiler to generate a compiled file, and linking the compiled file with the dynamic link library file.

Another embodiment in the above application has the following advantages or benefits: according to the compiling mode, the converted codes can run smoothly when running on the heterogeneous accelerator.
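The two compilation steps can be sketched as command lines assembled in Python. The file names, the library name and the choice of `gfortran` as the Fortran compiler are assumptions for illustration; the application names only hipcc. The flags `-shared` and `-fPIC` are the usual way to request a shared library from a clang-based driver.

```python
def build_commands(wrapper_src, main_src, lib_name="kernels",
                   fortran_compiler="gfortran", output="app"):
    """Return the two command lines as argument lists (nothing is executed)."""
    lib_file = f"lib{lib_name}.so"
    # Step 1: hipcc builds the wrapper (which includes the target header with
    # the kernel source) into a dynamic link library.
    compile_lib = ["hipcc", "-shared", "-fPIC", wrapper_src, "-o", lib_file]
    # Step 2: the Fortran compiler builds the modified main code and links it
    # against that library.
    compile_main = [fortran_compiler, main_src, "-L.", f"-l{lib_name}",
                    "-o", output]
    return compile_lib, compile_main


# Hypothetical file names.
compile_lib, compile_main = build_commands("saxpy_wrapper.cpp", "main.f90")
```

Each list could then be passed to `subprocess.run(cmd, check=True)` on a machine where both compilers are installed.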

Optionally, before modifying the calling code of the runtime system interface into the calling code of the HIP-compatible runtime system interface, the method further includes:

determining that the HIP supports the calling code of the runtime system interface and the function of the kernel function of the CUDA Fortran language.

Another embodiment in the above application has the following advantages or benefits: the calling code of the runtime system interface is modified into the calling code of the HIP-compatible runtime system interface only after it is determined that the HIP supports the calling code of the runtime system interface and the function of the kernel function of the CUDA Fortran language, so that the code conversion efficiency can be further improved.
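One simple way to picture the support check is a lookup of every runtime call against a table of calls with known HIP counterparts, as in the Python sketch below. The table is a small hypothetical excerpt, the matching is case-sensitive although Fortran is not, and the application does not describe how support is actually determined.

```python
import re

# Hypothetical excerpt: CUDA runtime names assumed to have a HIP counterpart.
SUPPORTED = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}


def unsupported_calls(second_code: str):
    """Return the sorted runtime names in the code that are not in the table."""
    used = set(re.findall(r"\bcuda[A-Z]\w*", second_code))
    return sorted(used - set(SUPPORTED))


# Hypothetical second code.
SECOND_CODE = """\
istat = cudaMalloc(x_d, n)
istat = cudaFuncSetCacheConfig(saxpy, 1)
istat = cudaFree(x_d)
"""

missing = unsupported_calls(SECOND_CODE)
```

A non-empty result would correspond to the case in which the tool reports that the code is not supported or asks the user to modify it.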

In a second aspect, the present application provides a transcoding apparatus, comprising:

a transcoding module, configured to transcode a kernel function of a CUDA Fortran language in a code to be converted to obtain a header file and a source file comprising the kernel function of a CUDA C language, wherein the code to be converted comprises a main code and the kernel function of the CUDA Fortran language, the main code comprises a first code and a second code, and the second code comprises a calling code for calling the kernel function of the CUDA Fortran language;

a processing module to:

In a third aspect, the present application provides a transcoding device, comprising:

a processor; and

a memory for storing executable instructions of the processor;

wherein the processor is configured to perform the transcoding method of the first aspect or any of the possible implementations of the first aspect via execution of the executable instructions.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the transcoding method described in the first aspect or any of the possible implementation manners of the first aspect.

In a fifth aspect, the present application provides a computer program product, which includes a computer program that, when executed by a processor, implements the transcoding method described in the first aspect or any of the possible implementations of the first aspect.

According to the code conversion method, apparatus, device and storage medium provided by the present application, a kernel function of the CUDA Fortran language in the code to be converted is transcoded to obtain a header file and a source file including the kernel function of the CUDA C language, an encapsulation function for calling the kernel function of the CUDA C language is constructed according to the source file and the header file, the second code is modified into a third code compatible with the HIP, the third code including code for calling the encapsulation function, and the converted code is obtained according to the modified main code, the source file, the header file and the encapsulation function. In this way, the CUDA Fortran code is converted into HIP C code automatically, which improves code conversion efficiency. Only the kernel function in the code to be converted and the calling process of the kernel function undergo programming language conversion; the first code does not, so conversion of the code to be converted as a whole is avoided, code conversion efficiency is higher, and dependence on the C language is reduced.

Drawings

Fig. 1 is a schematic view of an application scenario of a transcoding method provided in an embodiment of the present application;

Fig. 2 is a flowchart of a transcoding method provided in an embodiment of the present application;

Fig. 3 is a flowchart of a transcoding method provided in an embodiment of the present application;

Fig. 4 is a flowchart of a transcoding method provided in an embodiment of the present application;

Fig. 5 is a flowchart of a transcoding method provided in an embodiment of the present application;

Fig. 6 is a flowchart of a transcoding method provided in an embodiment of the present application;

Fig. 7 is a flowchart of a transcoding method provided in an embodiment of the present application;

Fig. 8 is a schematic structural diagram of a transcoding device according to an embodiment of the present application;

Fig. 9 is a schematic structural diagram of a transcoding device according to an embodiment of the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.

The terms "first" and "second," and the like in the description, the claims, and the drawings of the embodiments of the present application are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

First, some terms in the embodiments of the present application are explained below to facilitate understanding by those skilled in the art.

1. Heterogeneous accelerator: GPU-like heterogeneous acceleration hardware, such as a Graphics Processing Unit (GPU), a Many Integrated Core (MIC) processor, a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), and so on.

2. HIP, a heterogeneous programming model that can run on CUDA and other heterogeneous accelerators.

3. CUDA, a heterogeneous parallel programming model that can use NVIDIA GPU for general purpose computing.

4. CUDA Fortran, a Fortran extended programming model that can run on NVIDIA GPUs.

To solve the problem that, in the prior art, CUDA Fortran code is converted into HIP C code manually, which places high demands on personnel, takes considerable time and makes code conversion inefficient, embodiments of the present application provide a code conversion method, apparatus, device and storage medium. The code to be converted (CUDA Fortran code) includes a main code and a kernel function of the CUDA Fortran language; the main code includes a first code and a second code, and the second code includes a calling code for calling the kernel function of the CUDA Fortran language. After the code to be converted is obtained, the kernel function of the CUDA Fortran language is transcoded to obtain a header file and a source file including the kernel function of the CUDA C language. The second code is then modified to obtain a third code compatible with the HIP, the third code including the modified calling code, and an encapsulation function is constructed according to the source file and the header file. The encapsulation function is an intermediate calling layer: the calling code is Fortran code and cannot directly call the kernel function of the CUDA C language, so the calling code is modified into code that calls the encapsulation function, through which the modified calling code indirectly calls the kernel function of the CUDA C language. Finally, the converted code is obtained according to the modified main code, the source file, the header file and the encapsulation function.
In this way, the CUDA Fortran code is converted into HIP C code automatically, which improves code conversion efficiency. Only the kernel function and the kernel function calling process in the CUDA Fortran code undergo programming language conversion; the first code does not, so conversion of the CUDA Fortran code as a whole is avoided, code conversion efficiency is higher, and dependence on the C language is reduced. The technical solution provided by the present application is described in detail below with reference to the accompanying drawings.

Next, an application scenario according to an embodiment of the present application will be described as an example.

The code conversion method provided by the embodiment of the application can be at least applied to a scene that a scientific computing program developed by using the Fortran language runs on a heterogeneous accelerator except for the NVIDIA GPU, and particularly can be applied to transplanting the scientific computing program developed by using the Fortran language to the HIP in an efficient manner.

Illustratively, Fig. 1 is a schematic view of an application scenario of a transcoding method provided in an embodiment of the present application. As shown in Fig. 1, the present embodiment involves a CUDA Fortran compiler 10 and a code conversion apparatus 20. During porting, the scientific computing program developed in the Fortran language is first rewritten in Fortran according to the writing requirements of CUDA to obtain a heterogeneous parallel program (CUDA Fortran code for short); for example, the scientific computing program can be compiled into CUDA Fortran code by the CUDA Fortran compiler 10 shown in Fig. 1. The CUDA Fortran code is then input into the code conversion apparatus 20, which converts it by using the code conversion method provided by the embodiments of the present application to obtain the converted code, that is, a heterogeneous parallel program written in the C language according to the writing requirements of the HIP (HIP C code for short). Optionally, the code conversion apparatus 20 may output the converted code. In this way, the scientific computing program developed in the Fortran language is ported to the HIP efficiently.

The following describes the technical solutions of the present application and how to solve the above technical problems with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.

Fig. 2 is a flowchart of a transcoding method provided in an embodiment of the present application, where the transcoding method may be performed by a transcoding device, and the transcoding device may be implemented by software and/or hardware. As shown in fig. 2, the method of this embodiment may include:

S101, transcoding a kernel function of a CUDA Fortran language in codes to be converted to obtain a header file and a source file comprising the kernel function of the CUDA C language, wherein the codes to be converted comprise main codes and the kernel function of the CUDA Fortran language, the main codes comprise first codes and second codes, and the second codes comprise calling codes for calling the kernel function of the CUDA Fortran language.

Specifically, the code to be converted is the code input to the code conversion apparatus, and the code to be converted may be the CUDA Fortran code shown above, the code to be converted includes a main code and a kernel function of the CUDA Fortran language, the main code includes a first code and a second code, and the second code includes a calling code for calling the kernel function of the CUDA Fortran language, where the first code is, for example, a code for performing management of a heterogeneous program and operating a heterogeneous accelerator.

In an implementable manner, transcoding the kernel function of the CUDA Fortran language in the code to be converted may consist of calling a CUDA Fortran compiler (e.g., pgfortran) to compile the kernel function of the CUDA Fortran language and generate a source file and a header file, whose file names end with ".gpu" and ".h" respectively. Optionally, the CUDA Fortran compiler and the transcoding apparatus of this embodiment may be located on the same device, for example, on the same heterogeneous accelerator, so that the transcoding apparatus can call the CUDA Fortran compiler conveniently.

S102, constructing an encapsulation function according to the source file and the header file, wherein the encapsulation function is used for calling the kernel function of the CUDA C language.

S103, modifying the second code to obtain a third code compatible with the HIP, wherein the third code comprises a modified calling code.

Optionally, the second code further includes a calling code and a reference module of the runtime system interface, and modifying the second code may include:

S1031, modifying the calling code of the runtime system interface into the calling code of the HIP-compatible runtime system interface.

Optionally, before modifying the calling code of the runtime system interface into the calling code of the HIP-compatible runtime system interface, the method may further include:

determining that the HIP supports the calling code of the runtime system interface and the function of the kernel function of the CUDA Fortran language. If it is determined that the HIP does not support them, indication information indicating the lack of support may be sent, or indication information instructing the user to modify the code to be converted may be sent. Performing this check before the modification can further improve the efficiency of code conversion.

S1032, the reference module is modified into the HIP reference module.

S1033, modifying the calling code into a code for calling the encapsulation function.

S1034, obtaining a third code according to the calling code of the HIP compatible runtime system interface, the HIP reference module and the code for calling the encapsulation function.

S104, obtaining the converted code according to the modified main code, the source file, the header file and the encapsulation function.

Specifically, the modified main code is obtained according to the first code and the third code, and the converted code comprises the modified main code, the source file, the header file and the encapsulation function. Optionally, the modified main code, the source file, the header file, and the encapsulation function may be packaged to obtain the converted code.
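Steps S1031 to S1033 amount to three textual rewrites of the second code, which the following Python sketch illustrates. The API mapping table is a small hypothetical excerpt, `cudafor` is assumed as the original reference module, `hipfort` is assumed as the HIP reference module, and the `launch_<kernel>` naming of the encapsulation function is invented for the example; the application does not name any of these.

```python
import re

# Hypothetical excerpt of a CUDA -> HIP runtime interface mapping.
API_MAP = {"cudaMalloc": "hipMalloc", "cudaMemcpy": "hipMemcpy",
           "cudaFree": "hipFree"}

# Matches "call kernel<<<run params>>>(args)"; nested parentheses inside the
# argument list are not handled by this simple pattern.
LAUNCH_RE = re.compile(r"call\s+(\w+)\s*<<<\s*(.*?)\s*>>>\s*\((.*?)\)",
                       re.IGNORECASE)


def to_third_code(second_code: str, hip_module: str = "hipfort") -> str:
    # S1031: runtime system interface calls -> HIP-compatible calls.
    code = re.sub(r"\bcuda[A-Z]\w*",
                  lambda m: API_MAP.get(m.group(0), m.group(0)), second_code)
    # S1032: reference module -> HIP reference module.
    code = re.sub(r"\buse\s+cudafor\b", f"use {hip_module}", code,
                  flags=re.IGNORECASE)
    # S1033: kernel launch -> call to the encapsulation function, with the
    # run parameters passed as leading arguments.
    code = LAUNCH_RE.sub(
        lambda m: f"call launch_{m.group(1)}({m.group(2)}, {m.group(3)})", code)
    return code


# Hypothetical second code.
SECOND_CODE = """\
use cudafor
istat = cudaMalloc(x_d, n)
call saxpy<<<grid, block>>>(n, a, x_d)
istat = cudaFree(x_d)
"""

third_code = to_third_code(SECOND_CODE)
```

The first code is left untouched by this sketch, which reflects the point that only the second code needs to be modified.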

Specifically, the code to be converted includes a main code and a kernel function of the CUDA Fortran language. The main code, which includes a first code and a second code (the second code including the calling code of the runtime system interface, the reference module, and the calling code for calling the kernel function of the CUDA Fortran language), is written in the Fortran language. After the transcoding in S101, the kernel function of the CUDA Fortran language in the code to be converted becomes a kernel function of the CUDA C language, which the calling code written in the Fortran language cannot call directly. Therefore, in this embodiment, an encapsulation function for calling the kernel function of the CUDA C language is first constructed according to the source file and the header file, serving as an intermediate calling layer. The second code is then modified to obtain a HIP-compatible third code, in which the calling code is replaced by code for calling the encapsulation function. When the converted code runs, the code for calling the encapsulation function calls the encapsulation function, and the encapsulation function calls the kernel function of the CUDA C language, so that the kernel function of the CUDA C language is called indirectly.

In the code conversion method provided by this embodiment, the kernel function of the CUDA Fortran language in the code to be converted is transcoded to obtain a header file and a source file including the kernel function of the CUDA C language; an encapsulation function for calling the kernel function of the CUDA C language is constructed according to the source file and the header file; the second code is modified into a HIP-compatible third code that includes a code for calling the encapsulation function; and the converted code is obtained according to the modified main code, the source file, the header file, and the encapsulation function. The CUDA Fortran code is thus automatically converted into HIP C code, which improves code conversion efficiency. During the conversion, programming language conversion is performed only on the kernel function and on the calling process of the kernel function; the first code does not need programming language conversion. Conversion of the code to be converted as a whole is therefore avoided, the code conversion efficiency is higher, and the dependence on the C language is reduced.

On the basis of the above embodiment, as an implementable manner, the encapsulation function includes a kernel function encapsulation module in the C language and a kernel function calling module in the Fortran language, the kernel function encapsulation module in the C language is used for performing a calling process on a kernel function in the CUDA C language and encapsulating the calling process, and the kernel function calling module in the Fortran language is used for constructing a Fortran interface and calling a function provided by the kernel function encapsulation module in the C language.

In this embodiment, a kernel function encapsulation module of the C language and a kernel function calling module of the Fortran language are constructed according to a source file and a header file, a Fortran interface is constructed by the kernel function calling module of the Fortran language and a function provided by the kernel function encapsulation module of the C language is called, the kernel function encapsulation module of the C language calls a kernel function of the CUDA C language, and when a converted code is run, the code for calling the encapsulation function can indirectly call the kernel function of the CUDA C language through the kernel function calling module of the Fortran language and the kernel function encapsulation module of the C language, so that the kernel function of the CUDA C language is called, and code conversion of a kernel function calling process is realized.

Fig. 3 is a flowchart of a transcoding method provided in an embodiment of the present application. As shown in Fig. 3, on the basis of the embodiment shown in Fig. 2, optionally, S102 in the method of this embodiment may specifically include:

S1021, constructing a kernel function encapsulation module of the C language according to the source file and the header file.

S1022, constructing a kernel function calling module of the Fortran language according to the kernel function encapsulation module of the C language.

In this embodiment, since the kernel function calling module of the Fortran language is used to construct the Fortran interface and call the function provided by the kernel function encapsulation module of the C language, the kernel function encapsulation module of the C language is constructed according to the source file and the header file, and then the kernel function calling module of the Fortran language is constructed according to the kernel function encapsulation module of the C language, so that the kernel function calling module of the Fortran language can call the function provided by the kernel function encapsulation module of the C language conveniently.

Fig. 4 is a flowchart of a transcoding method provided in an embodiment of the present application, and as shown in fig. 4, the method of this embodiment is based on the embodiment shown in fig. 3, and further, in S1021, building a kernel function encapsulation module in C language according to a source file and a header file, which may specifically be implemented by the following steps:

S201, extracting kernel function parameters from the kernel function of the CUDA C language according to the kernel function declaration in the header file.

Specifically, the header file (.h file) holds the kernel function declaration (i.e., the C language declaration) of the kernel function. The kernel function of the CUDA C language includes a header file reference, kernel function parameters, and a kernel function body, and optionally may further include definitions of the data structures used by the kernel function. The kernel function parameters are extracted from the kernel function of the CUDA C language according to the kernel function declaration in the header file.

S202, according to the structure of the kernel function parameters, constructing a parameter declaration of the kernel function encapsulation module of the C language, wherein the parameter declaration of the kernel function encapsulation module of the C language comprises kernel function operation parameters and kernel function parameters.

For example, the kernel function parameters of one kernel function are as follows:

extern "C" __global__ void vectoradd_(
    signed char* __restrict__ _pin,
    signed char* _pout,
    int n__V_value,
    int n__V_width,
    signed char* __restrict__ _pin_sd,
    signed char* __restrict__ _pout_sd)

The structure of the kernel function parameters comprises:

    signed char* __restrict__ _pin,
    signed char* _pout,
    int n__V_value,
    int n__V_width,
    signed char* __restrict__ _pin_sd,
    signed char* __restrict__ _pout_sd

According to the structure of the kernel function parameters, kernel function operation parameters are added. The kernel function operation parameters include four parameters: the grid dimension (gridSize), the block dimension (blockSize), the dynamic shared memory size (sharedMemSize), and the stream object (stream). The parameter declaration of the kernel function encapsulation module of the C language is thus obtained, for example, as follows:

void vectoradd_(
    dim3 gridSize,
    dim3 blockSize,
    size_t sharedMemSize,
    hipStream_t stream,
    signed char* __restrict__ _pin,
    signed char* _pout,
    int n__V_value,
    int n__V_width,
    signed char* __restrict__ _pin_sd,
    signed char* __restrict__ _pout_sd)
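As an illustration of S202, the construction of the parameter declaration can be sketched as a small text generator. This is a minimal sketch, not the implementation of the application: the helper name build_wrapper_declaration and the constant kLaunchParams are hypothetical, and the kernel function parameters are assumed to be already available as a list of C declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// The four kernel function operation parameters named in the text.
const std::vector<std::string> kLaunchParams = {
    "dim3 gridSize", "dim3 blockSize", "size_t sharedMemSize", "hipStream_t stream"};

// Hypothetical helper: prepend the operation parameters to the kernel's own
// parameters and render the parameter declaration of the wrapper as text.
std::string build_wrapper_declaration(const std::string& name,
                                      const std::vector<std::string>& kernel_params) {
    assert(!name.empty());
    std::vector<std::string> all = kLaunchParams;
    all.insert(all.end(), kernel_params.begin(), kernel_params.end());

    std::string out = "void " + name + "(";
    for (std::size_t i = 0; i < all.size(); ++i) {
        if (i > 0) out += ", ";
        out += all[i];
    }
    return out + ")";
}
```

Keeping the operation parameters in front of the kernel function parameters mirrors the ordering of the example declaration above.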

S203, constructing a second calling code according to the parameter declaration of the kernel function encapsulation module of the C language, wherein the parameters of the second calling code are the same as the parameters included in the parameter declaration of the kernel function encapsulation module of the C language.

Specifically, the second calling code is constructed in a preset format according to the parameter declaration of the kernel function encapsulation module of the C language, and its parameters are the same as the parameters included in that parameter declaration. For example, the second calling code constructed according to the parameter declaration of the kernel function encapsulation module of the C language is as follows:

vectoradd_<<<gridSize, blockSize, sharedMemSize, stream>>>
    (_pin, _pout, n__V_value, n__V_width, _pin_sd, _pout_sd)
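The construction of the second calling code in S203 can likewise be sketched as text generation. The helpers param_name and build_launch_statement are hypothetical names introduced for illustration, and the sketch assumes that every parameter declaration ends with its identifier, as in the example above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Return the trailing identifier of a C parameter declaration,
// e.g. "signed char* __restrict__ _pin" -> "_pin".
std::string param_name(const std::string& decl) {
    std::size_t end = decl.size();
    while (end > 0 && std::isspace(static_cast<unsigned char>(decl[end - 1]))) --end;
    std::size_t begin = end;
    while (begin > 0 && (std::isalnum(static_cast<unsigned char>(decl[begin - 1])) ||
                         decl[begin - 1] == '_')) {
        --begin;
    }
    assert(begin < end);  // assumption: every parameter is named
    return decl.substr(begin, end - begin);
}

// Render the triple-chevron launch with the four operation parameters,
// followed by the kernel function parameter names in their original order.
std::string build_launch_statement(const std::string& kernel,
                                   const std::vector<std::string>& kernel_param_decls) {
    std::string out = kernel + "<<<gridSize, blockSize, sharedMemSize, stream>>>(";
    for (std::size_t i = 0; i < kernel_param_decls.size(); ++i) {
        if (i > 0) out += ", ";
        out += param_name(kernel_param_decls[i]);
    }
    return out + ");";
}
```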

S204, copying the code of the source file into the header file to obtain a target header file.

For example, copying the code of the source file into the header file results in the target header file "vectoradd.n001.h".

S205, obtaining the kernel function encapsulation module of the C language according to the preset header file corresponding to the HIP, the target header file, the parameter declaration of the kernel function encapsulation module of the C language and the second calling code.

Specifically, the kernel function encapsulation module of the C language includes an encapsulation function source code file. The encapsulation function source code file includes the header file corresponding to the HIP, the target header file, the parameter declaration of the kernel function encapsulation module of the C language, and the second calling code, which corresponds to the calling process in which the kernel function encapsulation module of the C language calls the kernel function of the CUDA C language. The target header file is the file obtained by copying the code of the source file into the header file.

Specifically, the preset header file corresponding to the HIP, the target header file, the parameter declaration of the kernel function encapsulation module of the C language, and the second calling code are combined in a preset order to obtain the kernel function encapsulation module of the C language. For example, the kernel function encapsulation module of the C language obtained from the preset header file corresponding to the HIP and from the parameter declaration, the second calling code and the target header file obtained in S202-S204 is as follows:
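The combination step of S205 can be sketched as follows. This is only an illustration under stated assumptions: the preset order is not spelled out in the text, so the order used here (HIP header, target header, declaration, launch) is an assumption; the HIP header is assumed to be hip/hip_runtime.h; and assemble_wrapper_source is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>

// Hypothetical assembler for the encapsulation function source code file.
// ASSUMPTION: the pieces are emitted in the order HIP header, target header,
// wrapper declaration, launch statement.
std::string assemble_wrapper_source(const std::string& target_header,
                                    const std::string& declaration,
                                    const std::string& launch_statement) {
    assert(!target_header.empty());
    std::string src;
    src += "#include \"hip/hip_runtime.h\"\n";        // header corresponding to the HIP
    src += "#include \"" + target_header + "\"\n\n";  // header with the kernel source copied in
    src += "extern \"C\" " + declaration + "\n{\n";   // C linkage so Fortran can bind to it
    src += "    " + launch_statement + "\n";          // second calling code
    src += "}\n";
    return src;
}
```

The generated text would then be written to the wrapper source file that is later compiled with hipcc.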

Fig. 5 is a flowchart of a transcoding method provided in an embodiment of the present application. As shown in Fig. 5, on the basis of the embodiment shown in Fig. 3, constructing the kernel function calling module of the Fortran language according to the kernel function encapsulation module of the C language in S1022 may specifically be implemented by the following steps:

S301, constructing a first calling code according to the parameter declaration of the kernel function encapsulation module of the C language, wherein the parameters of the first calling code are the same as the parameters of the kernel function encapsulation module of the C language after the matrix dimension description information parameters are removed.

Specifically, the parameters of the first calling code are the same as the parameters of the kernel function encapsulation module of the C language after the matrix dimension description information parameters are removed. Taking the parameter declaration of the kernel function encapsulation module of the C language in the embodiment shown in Fig. 4 as an example, the parameter declaration includes the parameters gridSize, blockSize, sharedMemSize, stream, _pin, _pout, n__V_value, n__V_width, _pin_sd, and _pout_sd, of which _pin_sd and _pout_sd are the matrix dimension description information parameters. Removing _pin_sd and _pout_sd yields gridSize, blockSize, sharedMemSize, stream, _pin, _pout, n__V_value, and n__V_width, which are the parameters of the first calling code. The first calling code is constructed according to a preset format of the first calling code. For example, the first calling code constructed according to the parameter declaration of the kernel function encapsulation module of the C language is as follows:

call vectorAdd(gridSize,blockSize,sharedMemSize,streamId, &
    in,out,value,width,in_sd,out_sd)

S302, constructing a calling function declaration according to the parameters of the first calling code.

Specifically, the parameters included in the calling function declaration are the same as the parameters of the first calling code. For example, the calling function declaration constructed according to the parameters of the first calling code is as follows:

subroutine vectorAdd(gridSize,blockSize,sharedMemSize,streamId, &
    in,out,value,width)
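The removal of the matrix dimension description information parameters and the construction of the calling function declaration can be sketched as below. This is a minimal sketch: recognising the descriptor parameters by an "_sd" suffix is an assumption drawn from the example names (the text identifies them by position and format), and both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// ASSUMPTION: descriptor parameters are recognised by an "_sd" suffix,
// as in the example names in_sd and out_sd.
bool is_dimension_descriptor(const std::string& name) {
    const std::string suffix = "_sd";
    return name.size() > suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Render "subroutine name(p1,p2,...)" from the wrapper parameter names,
// skipping the matrix dimension description information parameters.
std::string build_subroutine_declaration(const std::string& name,
                                         const std::vector<std::string>& wrapper_params) {
    assert(!name.empty());
    std::string out = "subroutine " + name + "(";
    bool first = true;
    for (const std::string& p : wrapper_params) {
        if (is_dimension_descriptor(p)) continue;
        if (!first) out += ",";
        out += p;
        first = false;
    }
    return out + ")";
}
```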

S303, obtaining the kernel function calling module of the Fortran language according to the calling function declaration, the preset reference module and the first calling code.

Specifically, the kernel function calling module of the Fortran language includes the calling function declaration, the reference module, and the first calling code, which corresponds to the calling process of calling the function provided by the kernel function encapsulation module of the C language. Optionally, the Fortran-language definition of the function type corresponding to the function name may further be removed.

The preset reference module is, for example, "iso_c_binding". For example, the kernel function calling module of the Fortran language obtained by combining the calling function declaration, the preset reference module and the first calling code in a preset order is as follows:

In the embodiment of the present application, when the kernel function parameters include a matrix parameter of unknown dimension, the kernel function calling module of the Fortran language requires the caller to pass the related information. A matrix of unknown dimension is a matrix variable whose dimensions are not explicitly declared within the function; explicitly declaring the dimensions means explicitly specifying the interval of each dimension of the matrix with the dimension syntax. The kernel function calling module of the Fortran language can perform correct address offset calculation and complete value-fetching operations only by means of the matrix dimension information description array. Therefore, when the kernel function parameters include matrix dimension description information parameters, a matrix dimension information description array needs to be constructed according to the matrix dimension description information parameters and added to the kernel function calling module of the Fortran language.

Further, in an implementable manner, the kernel function calling module of the Fortran language may further include a matrix dimension information description array, and before S303 the method of this embodiment may further include:

S304, when the kernel function parameters comprise matrix dimension description information parameters and the format of the matrix dimension description information parameters is a preset format, constructing a matrix dimension information description array according to the matrix dimension description information parameters.

Specifically, the kernel function parameters include an array parameter, a value parameter, and a matrix dimension description information parameter, and as in the above kernel function parameter example, the matrix dimension description information parameter is located behind the array parameter and the value parameter, and the matrix dimension description information parameter can be found according to the position setting of each parameter in the kernel function parameter.

For example, the matrix dimension description information parameters are signed char* __restrict__ _pin_sd and signed char* __restrict__ _pout_sd, and their format is the prefix signed char*. A matrix dimension information description array is then constructed according to the matrix dimension description information parameters; for example, the constructed matrix dimension information description arrays are in_sd and out_sd.

S305, adding the matrix dimension information description array into a kernel function calling module of the Fortran language.

Specifically, the matrix dimension information description array is added to the above exemplary kernel function calling module of the Fortran language, and the resulting kernel function calling module of the Fortran language is, for example:

In this embodiment, when it is determined that the kernel function parameters include matrix dimension description information parameters and the format of the matrix dimension description information parameters is a preset format, a matrix dimension information description array is constructed according to the matrix dimension description information parameters and added to the kernel function calling module of the Fortran language, so that the kernel function calling module of the Fortran language can perform correct address offset calculation and complete value-fetching operations by means of the matrix dimension information description array.

Fig. 6 is a flowchart of a transcoding method provided in an embodiment of the present application. As shown in Fig. 6, on the basis of the embodiment shown in Fig. 5, constructing the matrix dimension information description array according to the matrix dimension description information parameters in S304 may specifically include:

S3041, obtaining, according to the matrix dimension description information parameters, the matrix dimension description information at the time parameters are passed to the kernel function of the CUDA C language, wherein the matrix dimension description information comprises the matrix dimension and each dimension interval of the matrix.

S3042, constructing a matrix dimension information description array according to the matrix dimension and each dimension interval of the matrix, where the matrix dimension information description array has a first dimension and a second dimension, the first dimension is the same as the matrix dimension, and the second dimension represents the information of each dimension of the matrix. The information of one dimension includes: the start value of the dimension interval, the size of the dimension interval, the end value of the dimension interval, and the total offset of the dimension relative to the start address of the matrix.

Specifically, for example, if the matrix is referenced in the ":" form, the start value of the dimension interval of the matrix is defined as 0. The dimension interval size is calculated from each dimension interval of the matrix, and is defined as the end value of the dimension interval minus the start value, plus 1.
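The per-dimension information of S3042 can be sketched as follows. The text defines the interval size (end value minus start value plus 1) but does not define how the total offset is computed; the sketch therefore assumes a Fortran-style column-major layout and stores the element stride of each dimension as its offset. The type DimInfo and the function build_dimension_descriptor are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <utility>
#include <vector>

// One row per matrix dimension: {start, size, end, offset}.
using DimInfo = std::array<long, 4>;

// bounds holds the (start, end) interval of each dimension, first dimension first.
std::vector<DimInfo> build_dimension_descriptor(
        const std::vector<std::pair<long, long>>& bounds) {
    std::vector<DimInfo> desc;
    long stride = 1;  // ASSUMPTION: column-major element stride used as the offset
    for (const auto& b : bounds) {
        assert(b.second >= b.first);
        const long size = b.second - b.first + 1;  // end value - start value + 1
        desc.push_back(DimInfo{b.first, size, b.second, stride});
        stride *= size;  // the next dimension advances by a whole slice
    }
    return desc;
}
```

For a matrix referenced in the ":" form, the caller would pass 0 as the start value of the corresponding interval, following the definition above.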

In the above embodiment, the converted code needs to be compiled and then run when running, and as an implementable manner, the converted code may be compiled as follows:

The encapsulation function, the source file and the header file are compiled with the hipcc compiler to generate a dynamic link library file.

The codes are compiled according to the method, so that the converted codes can run smoothly.

The following describes a detailed process of the transcoding method provided in the present application with reference to a specific embodiment.

Fig. 7 is a flowchart of a transcoding method provided in an embodiment of the present application, where the transcoding method may be performed by a transcoding device, and the transcoding device may be implemented by software and/or hardware. As shown in fig. 7, the method of this embodiment may include:

S401, receiving the code to be converted.

Specifically, the code to be converted is code input to the code conversion device, the code to be converted comprises main code and a kernel function of the CUDA Fortran language, the main code comprises first code and second code, the second code comprises calling code of a runtime system interface, a reference module and calling code for calling the kernel function of the CUDA Fortran language, and the first code is code for managing a heterogeneous program and operating a heterogeneous accelerator.

The following main code is taken as an example for explanation:

main code:

In the above main code, the calling code of the runtime system interface is not shown; what is shown is the code other than the calling code of the runtime system interface. In this part of the code, the parts in bold, "use cudafor" and "call vectorAdd<<<1,n>>>(a_d,b_d,3,n)", are respectively the reference module and the calling code for calling the kernel function of the CUDA Fortran language, and the rest is the first code.

S402, determining that the code to be converted is the code designed and realized by using CUDA Fortran or the code realized by using CUDA Fortran.

S403, performing a corresponding operation according to whether the HIP supports the calling code of the runtime system interface and the kernel function of the CUDA Fortran language.

Specifically, the hipify-perl tool can be called to scan the code to be converted. The scan output includes information such as calling codes of unsupported runtime system interfaces, unsupported kernel functions and unidentifiable kernel functions, and whether the HIP supports the calling code of the runtime system interface and the kernel function of the CUDA Fortran language can be determined from this output.

Specifically, if the HIP does not support the calling code of the runtime system interface and the kernel function of the CUDA Fortran language, indication information that they are not supported is sent, or indication information instructing the user to modify the code to be converted may be sent. This can further improve the efficiency of code conversion.

If the HIP supports the calling code of the runtime system interface and the kernel function of the CUDA Fortran language, the calling code of the runtime system interface is modified into a HIP-compatible calling code of the runtime system interface. Specifically, the function name in the calling code of the runtime system interface is modified, or format conversion of a data structure is performed on the calling code of the runtime system interface. For example, the calling code of the runtime system interface may be a Runtime Application Programming Interface (Runtime API) provided by the CUDA; the modification may be to modify the function name of the Runtime API provided by the CUDA, or to perform format conversion of a data structure on the Runtime API provided by the CUDA.
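The function-name modification of the Runtime API can be illustrated with a lookup table. This is a minimal sketch rather than the mechanism of the application: the table covers only a few well-known CUDA runtime names and their HIP counterparts, the helper name to_hip_runtime_name is hypothetical, and an empty result stands in for the "not supported" indication.

```cpp
#include <cassert>
#include <map>
#include <string>

// Map a CUDA Runtime API name to its HIP counterpart.
// Returns an empty string when the name is not in the table, which a caller
// could treat as the "unsupported" case described in the text.
std::string to_hip_runtime_name(const std::string& cuda_name) {
    assert(!cuda_name.empty());
    static const std::map<std::string, std::string> table = {
        {"cudaMalloc", "hipMalloc"},
        {"cudaFree", "hipFree"},
        {"cudaMemcpy", "hipMemcpy"},
        {"cudaMemcpyHostToDevice", "hipMemcpyHostToDevice"},
        {"cudaMemcpyDeviceToHost", "hipMemcpyDeviceToHost"},
        {"cudaDeviceSynchronize", "hipDeviceSynchronize"},
    };
    const auto it = table.find(cuda_name);
    return it == table.end() ? std::string() : it->second;
}
```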

S404, transcoding the kernel function of the CUDA Fortran language in the code to be converted to obtain a header file and a source file comprising the kernel function of the CUDA C language.

The kernel function of the CUDA C language includes a header file reference, kernel function parameters, and a kernel function body, and optionally may further include definitions of the data structures used by the kernel function.

Specifically, a CUDA Fortran compiler (e.g., pgfortran) may be invoked to compile the kernel function of the CUDA Fortran language, generating a source file and a header file whose file names end with ".gpu" and ".h", respectively.

Exemplarily, the transcoding process of a kernel function of the CUDA Fortran language is given below. The following is an example of a code to be converted (named vectorAdd.n001.cuf):

In the code to be converted, the kernel function of the CUDA Fortran language is "vectorAdd". When "vectorAdd" is transcoded, the following instruction is used to generate the transcoded program files vectorAdd.n001.gpu and vectorAdd.n001.h:

pgfortran -c -Mcuda=keepgpu vectorAdd.cuf

Here vectorAdd.n001.gpu is the source file and vectorAdd.n001.h is the header file. The source file comprises the following contents:

Wherein #include "cuda_runtime.h" and #include "vectoradd.n001.h" are header file references;

extern "C" __global__ void vectoradd_(
    signed char* __restrict__ _pin,
    signed char* _pout,
    int n__V_value,
    int n__V_width,
    signed char* __restrict__ _pin_sd,
    signed char* __restrict__ _pout_sd)

is the kernel function parameter part; the remaining part is the kernel function body.

In particular, the source file vectorAdd.n001.gpu holds the C language implementation of the kernel function of the CUDA C language (vectorAdd C). During transcoding, the corresponding C language type identifiers are generated automatically from the type definitions in the kernel function of the CUDA Fortran language. Array parameters are declared as signed character pointers ("signed char*"), such as the _pin and _pout parameters shown in the kernel function parameter part above. Value parameters are defined in immediate form, such as the _value and _width parameters shown in the kernel function parameter part above. The kernel function of the CUDA C language obtained after transcoding also adds matrix dimension description information parameters for the array parameters; these carry the matrix dimension description information at the time the calling code calls the kernel function of the CUDA Fortran language, and their format begins with "signed char*", as shown in the kernel function parameter part above.

Header file vectoradd.n001.h includes the following:

extern "C" __global__ void vectoradd_(
    signed char* __restrict__ _pin,
    signed char* _pout,
    int n__V_value,
    int n__V_width,
    signed char* __restrict__ _pin_sd,
    signed char* __restrict__ _pout_sd);

Here _pin_sd and _pout_sd correspond to the in matrix and the out matrix, respectively, and vectoradd.n001.h holds the C language declaration of the kernel function vectorAdd C.

S405, constructing an encapsulation function according to the source file and the header file, wherein the encapsulation function comprises a kernel function encapsulation module of C language and a kernel function calling module of Fortran language.

The specific construction process can be seen in fig. 3 to 5, and is not described herein again.

When the kernel function calling module of the Fortran language is constructed and it is determined that the kernel function parameters include matrix dimension description information parameters whose format is the preset format, a matrix dimension information description array is constructed according to the matrix dimension description information parameters. For how to construct the matrix dimension information description array, reference may be made to the description of the embodiment shown in Fig. 6, which is not repeated here.

S406, modifying the reference module into a HIP reference module, and modifying the calling code into a code for calling the encapsulation function.

For example, taking the main code shown above as an example, the reference module "use cudafor" is modified to "use hipfort", and the calling code "call vectorAdd<<<1,n>>>(a_d,b_d,3,n)" is modified to "call vectorAdd(1,n,0,0,a_d,b_d,3,n)". The first four parameters "1,n,0,0" are the kernel function operation parameters, and the last four parameters "a_d,b_d,3,n" are the kernel function parameters.
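The modification of the calling code can be sketched as a string rewrite. This is an illustrative sketch only: rewrite_kernel_call is a hypothetical helper, missing shared memory size and stream entries are assumed to default to 0 as in the example above, and launch configurations containing nested commas (for example dim3(1,2)) are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Rewrite "call name<<<cfg>>>(args)" into "call name(cfg,0,0,args)".
// Lines without a chevron launch are returned unchanged.
std::string rewrite_kernel_call(const std::string& call) {
    const std::size_t open = call.find("<<<");
    if (open == std::string::npos) return call;
    const std::size_t close = call.find(">>>", open + 3);
    if (close == std::string::npos) return call;
    const std::size_t paren = call.find('(', close + 3);
    if (paren == std::string::npos) return call;
    assert(close >= open + 3);

    const std::string head = call.substr(0, open);                  // "call vectorAdd"
    std::string config = call.substr(open + 3, close - open - 3);   // "1,n"
    const std::string args = call.substr(paren + 1);                // "a_d,b_d,3,n)"

    // Pad the launch configuration to four entries: grid, block, shared memory, stream.
    std::size_t count = 1;
    for (char c : config) {
        if (c == ',') ++count;
    }
    for (; count < 4; ++count) config += ",0";

    const std::string sep = (args.empty() || args[0] == ')') ? "" : ",";
    return head + "(" + config + sep + args;
}
```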

S407, obtaining a third code according to a calling code of the HIP compatible runtime system interface, the HIP reference module and a code for calling the encapsulation function, and obtaining a modified main code according to the first code and the third code.

For example, taking the main code shown above as an example, except that the calling code of the runtime system interface is not shown, the modified main code is as follows:

S408, obtaining the converted code according to the modified main code, the source file, the header file and the encapsulation function.

In this embodiment, the converted code needs to be compiled as follows:

First, the hipcc compiler is used to compile the encapsulation function, the source file and the header file to generate a dynamic link library file. For example, the specific instruction used during compilation is as follows, where libkernel.so is the generated dynamic link library file:

hipcc -fPIC --shared -o libkernel.so wrapper.cpp

And then, a Fortran compiler is called to compile the first code to generate a compiled file, and the compiled file is linked with the dynamic link library file.

In this embodiment, the type of the Fortran compiler is not limited. Taking the gfortran compiler as an example, the instruction used in compiling is as follows:

gfortran -o test test.F90 -lkernel

The following are embodiments of the apparatus of the present application, which may be used to perform the above-described embodiments of the method of the present application. For details not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application described above.

Fig. 8 is a schematic structural diagram of a transcoding device provided in an embodiment of the present application, and as shown in fig. 8, the device of the present embodiment may include: a transcoding module 11 and a processing module 12, wherein,

the transcoding module 11 is configured to transcode a kernel function of a CUDA Fortran language in a code to be converted to obtain a header file and a source file including the kernel function of a CUDA C language, where the code to be converted includes a main code and the kernel function of the CUDA Fortran language, the main code includes a first code and a second code, and the second code includes a call code for calling the kernel function of the CUDA Fortran language;

a processing module 12 for:

modifying the second code to obtain a third code compatible with the migratable heterogeneous parallel programming model HIP, wherein the third code comprises a code for calling an encapsulation function;

and obtaining the converted code according to the modified main code, the source file, the header file and the encapsulation function, wherein the modified main code comprises a first code and a third code.

Optionally, the second code further includes a calling code and a reference module of the runtime system interface, and the processing module 12 is configured to modify the calling code of the runtime system interface into a calling code of the HIP-compatible runtime system interface;

modifying the reference module into a HIP reference module;

modifying the calling code into a code for calling the encapsulation function;

and obtaining a third code according to the calling code of the HIP compatible runtime system interface, the HIP reference module and the code for calling the packaging function.

Optionally, the encapsulation function includes a kernel function encapsulation module in the C language and a kernel function calling module in the Fortran language, where the kernel function encapsulation module in the C language is used to perform a calling process on a kernel function in the CUDA C language and encapsulate the calling process, and the kernel function calling module in the Fortran language is used to construct a Fortran interface and call a function provided by the kernel function encapsulation module in the C language.

Optionally, the kernel function calling module of the Fortran language includes a calling function declaration, a reference module, and a first calling code corresponding to the calling process in which the kernel function calling module of the Fortran language calls the function provided by the kernel function encapsulation module of the C language;

the kernel function encapsulation module of the C language comprises an encapsulation function source code file, the encapsulation function source code file comprises a header file corresponding to the HIP, a target header file, a parameter declaration of the kernel function encapsulation module of the C language and a second calling code corresponding to the calling process in which the kernel function encapsulation module of the C language calls the kernel function of the CUDA C language, and the target header file is a file obtained by copying the code of the source file into the header file.

Optionally, the processing module 12 is configured to:

and constructing a kernel function calling module of the Fortran language according to the kernel function packaging module of the C language.

Further, the processing module 12 is configured to: extract kernel function parameters from the kernel function of the CUDA C language according to the kernel function declaration in the header file;

construct, according to the structure of the kernel function parameters, a parameter declaration of the kernel function encapsulation module of the C language, wherein the parameter declaration of the kernel function encapsulation module of the C language includes kernel function operation parameters and the kernel function parameters;

construct a second calling code according to the parameter declaration of the kernel function encapsulation module of the C language, wherein the parameters of the second calling code are the same as the parameters included in the parameter declaration of the kernel function encapsulation module of the C language;

copy the code of the source file into the header file to obtain a target header file;

and obtain the kernel function encapsulation module of the C language according to the preset header file corresponding to the HIP, the target header file, the parameter declaration of the kernel function encapsulation module of the C language, and the second calling code.
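The construction of the kernel function encapsulation module of the C language can be sketched as a small generator. This is a sketch under stated assumptions rather than the applicant's implementation: the wrapper name `launch_<kernel>`, the file name `kernels_target.h`, passing the launch parameters as `dim3` pointers, and the use of `extern "C"` linkage are all illustrative choices. `hipLaunchKernelGGL` and `hip/hip_runtime.h` are existing HIP names.

```python
import re

def parse_kernel_declaration(decl):
    """Extract the kernel name and its (type, name) parameters from a declaration."""
    m = re.match(r"\s*__global__\s+void\s+(\w+)\s*\((.*)\)\s*;?\s*$", decl, re.S)
    if m is None:
        raise ValueError("not a CUDA C kernel declaration")
    name, raw = m.groups()
    params = []
    for item in (s.strip() for s in raw.split(",")):
        if item:
            ptype, pname = re.match(r"(.+?)(\w+)$", item).groups()
            params.append((ptype.strip(), pname))
    return name, params

def format_param(ptype, pname):
    """Render one C parameter, keeping pointer stars attached to the name."""
    sep = "" if ptype.endswith("*") else " "
    return f"{ptype}{sep}{pname}"

def build_c_wrapper(decl, header_code, source_code,
                    hip_header="hip/hip_runtime.h",      # preset HIP header
                    target_header="kernels_target.h"):   # hypothetical file name
    name, kernel_params = parse_kernel_declaration(decl)
    # Kernel operation (launch) parameters come first; dim3 pointers are an
    # assumption of this sketch.
    launch_params = [("dim3 *", "grid"), ("dim3 *", "block")]
    declaration = ", ".join(format_param(t, n)
                            for t, n in launch_params + kernel_params)
    # Second calling code: uses exactly the parameters declared above.
    args = [name, "*grid", "*block", "0", "0"] + [n for _, n in kernel_params]
    call = "hipLaunchKernelGGL(" + ", ".join(args) + ");"
    wrapper_source = "\n".join([
        f"#include <{hip_header}>",
        f'#include "{target_header}"',
        f'extern "C" void launch_{name}({declaration})',
        "{",
        f"    {call}",
        "}",
    ])
    # Target header: the original header with the source file's code copied in.
    target_header_code = header_code.rstrip("\n") + "\n" + source_code
    return wrapper_source, target_header_code

wrapper, target = build_c_wrapper(
    "__global__ void saxpy(int n, float a, float *x, float *y);",
    "// header\n", "// source\n")
```

Generating the declaration and the launch statement from the same parameter list keeps the second calling code consistent with the parameter declaration, as the steps above require.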

Further, the processing module 12 is configured to: construct a first calling code according to the parameter declaration of the kernel function encapsulation module of the C language, wherein the parameters of the first calling code are the same as the parameters of the kernel function encapsulation module of the C language after the matrix dimension description information parameters are removed;

and obtain the kernel function calling module of the Fortran language according to the calling function declaration, the preset reference module, and the first calling code.
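The rule for the parameters of the first calling code can be sketched as a filter over the wrapper's parameter list. The sketch follows the text literally. How matrix dimension description parameters are recognized is not stated here, so the `_dims` suffix and the wrapper name `launch_saxpy` are hypothetical.

```python
def build_first_call(wrapper_name, wrapper_params, is_dim_description):
    """Return a Fortran call statement for the C wrapper.

    wrapper_params: parameter names from the C wrapper's parameter declaration.
    is_dim_description: predicate marking matrix dimension description parameters.
    """
    # Keep every wrapper parameter except the matrix dimension descriptions.
    kept = [p for p in wrapper_params if not is_dim_description(p)]
    return f"call {wrapper_name}({', '.join(kept)})"

# Hypothetical convention: dimension description parameters end in "_dims".
stmt = build_first_call(
    "launch_saxpy",
    ["grid", "block", "n", "x", "x_dims"],
    lambda p: p.endswith("_dims"),
)
```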

Optionally, the kernel function calling module of the Fortran language further includes a matrix dimension information description array, and the processing module 12 is further configured to: when the kernel function parameters include matrix dimension description information parameters and the format of the matrix dimension description information parameters is a preset format, construct the matrix dimension information description array according to the matrix dimension description information parameters.

Optionally, the processing module 12 is further configured to: acquire, according to the matrix dimension description information parameters, the matrix dimension description information used when parameters are passed to the kernel function of the CUDA C language, wherein the matrix dimension description information includes the number of dimensions of the matrix and the interval of each dimension of the matrix;

and construct the matrix dimension information description array according to the number of dimensions of the matrix and the interval of each dimension, wherein the matrix dimension information description array has a first dimension and a second dimension, the size of the first dimension equals the number of dimensions of the matrix, and the second dimension holds the information of each dimension of the matrix, the information of one dimension including: the starting value of the dimension interval, the size of the dimension interval, the ending value of the dimension interval, and the total offset of the dimension relative to the starting address of the matrix.
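The layout of the matrix dimension information description array can be sketched as one row of four values per matrix dimension. The first three values follow directly from the dimension interval. The fourth is an assumption of this sketch: the "total offset" is read as the column-major element stride of the dimension (the product of the sizes of all earlier dimensions), which the text does not confirm.

```python
def describe_dims(bounds):
    """Build the description array from per-dimension (start, end) intervals.

    Each row is [start, size, end, offset]. The offset is ASSUMED to be the
    column-major stride in elements; the application does not define it.
    """
    rows = []
    stride = 1
    for start, end in bounds:
        size = end - start + 1
        rows.append([start, size, end, stride])
        stride *= size
    return rows

def linear_index(rows, indices):
    """Zero-based element offset of `indices` from the matrix start address."""
    return sum((i - row[0]) * row[3] for i, row in zip(indices, rows))

# A Fortran array declared as a(1:4, 0:2) has two dimensions.
dims = describe_dims([(1, 4), (0, 2)])
offset = linear_index(dims, (2, 1))
```

Carrying the starting value of each interval lets the kernel side address arrays whose lower bounds are not 1, which is the usual reason for passing such a descriptor alongside the raw pointer.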

Optionally, the converted code is compiled as follows:

Optionally, the processing module 12 is further configured to: before the calling code of the runtime system interface is modified into the calling code of the HIP-compatible runtime system interface, determine that HIP supports the calling code of the runtime system interface and the functionality of the kernel function of the CUDA Fortran language.
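The support check on runtime system interface calls can be sketched as a lookup against a table of known HIP equivalents. The table below is a deliberately small illustrative subset, and the sketch covers only runtime calls, not the check on kernel function functionality.

```python
import re

# Illustrative subset only; a real tool would consult the HIP documentation.
HIP_EQUIVALENTS = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

def unsupported_runtime_calls(code):
    """Return the runtime interface names in `code` that have no known HIP equivalent."""
    used = set(re.findall(r"\bcuda[A-Z]\w*", code))
    return sorted(used - set(HIP_EQUIVALENTS))

def can_convert(code):
    """Conversion proceeds only when every runtime call has a HIP equivalent."""
    return not unsupported_runtime_calls(code)
```

In this sketch a non-empty result from `unsupported_runtime_calls` would stop the conversion before any call is rewritten.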

The apparatus provided in the embodiment of the present application can implement the above method embodiment; for its specific implementation principles and technical effects, reference may be made to the method embodiment, and details are not repeated here.

It should be noted that the division of the above apparatus into modules is only a logical division; in an actual implementation, the modules may be wholly or partially integrated into one physical entity, or may be physically separate. These modules may all be implemented as software called by a processing element, or all be implemented in hardware; alternatively, some modules may be implemented as software called by a processing element and the others in hardware. For example, the processing module may be a separately provided processing element, may be integrated in a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, with a processing element of the apparatus calling and executing the function of the processing module. The other modules are implemented similarly. In addition, all or some of the modules may be integrated together or implemented independently. The processing element here may be an integrated circuit with signal processing capability. In implementation, each step of the above method, or each of the above modules, may be completed by an integrated logic circuit of hardware in the processor element or by instructions in the form of software.

For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs). As another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor that can call program code. As yet another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SoC).

The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are wholly or partially generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.

Fig. 9 is a schematic structural diagram of a transcoding device provided in an embodiment of the present application. As shown in Fig. 9, the transcoding device of the present embodiment may include a processor 21 and a memory 22.

The memory 22 is used for storing executable instructions of the processor 21.

The processor 21 is configured to perform the transcoding method in the above-described method embodiments via execution of executable instructions.

Optionally, the memory 22 may be separate from the processor 21 or integrated with it.

When the memory 22 is a device independent of the processor 21, the transcoding device of the present embodiment may further include:

a bus 23 for connecting the memory 22 and the processor 21.

Optionally, the transcoding device of this embodiment may further include: a communication interface 24, the communication interface 24 being connectable to the processor 21 via a bus 23.

The present application also provides a computer-readable storage medium having stored therein computer-executable instructions which, when run on a computer, cause the computer to perform the transcoding method in the above embodiments.

Embodiments of the present application further provide a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the transcoding method in the above embodiments.

In the description herein, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, such uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, the various embodiments or examples described in this specification, and the features of different embodiments or examples, can be combined by one skilled in the art provided they do not contradict one another.

Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims

1. A method of transcoding, comprising:

2. The method as claimed in claim 1, wherein the second code further comprises calling code and a reference module of a runtime system interface, and the modifying the second code to obtain a third code compatible with a migratable heterogeneous parallel programming model HIP comprises:

modifying the reference module into the HIP reference module;

3. The method according to claim 1, wherein the encapsulation function includes a kernel function encapsulation module of the C language and a kernel function calling module of the Fortran language, the kernel function encapsulation module of the C language is used for calling the kernel function of the CUDA C language and encapsulating that calling process, and the kernel function calling module of the Fortran language is used for constructing a Fortran interface and calling a function provided by the kernel function encapsulation module of the C language.

4. The method of claim 3, wherein the constructing an encapsulation function according to the source file and the header file comprises:

5. The method according to claim 4, wherein the constructing the kernel function encapsulation module of the C language according to the source file and the header file comprises:

6. The method according to claim 5, wherein the constructing the kernel function calling module of the Fortran language according to the kernel function encapsulation module of the C language comprises:

7. The method according to claim 6, wherein the kernel function calling module of the Fortran language further comprises a matrix dimension information description array, and the method further comprises:

8. The method of claim 7, wherein constructing a matrix dimension information description array according to the matrix dimension description information parameters comprises:

9. The method of any of claims 1-8, wherein the converted code is compiled as follows:

10. The method of claim 2, wherein prior to modifying the invocation code of the runtime system interface into invocation code of a HIP-compatible runtime system interface, the method further comprises:

11. A transcoding apparatus, comprising:

a processing module to:

12. A transcoding device, comprising:

a processor; and

a memory for storing executable instructions of the processor;

wherein the processor is configured to perform the transcoding method of any of claims 1-10 via execution of the executable instructions.

13. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the transcoding method of any one of claims 1 to 10.