WO2015032331A1 - OpenCL program compilation method and compiler


Info

Publication number
WO2015032331A1
Authority
WO
WIPO (PCT)
Prior art keywords
data transmission
mode
data
operation data
transmission mode
Prior art date
Application number
PCT/CN2014/085885
Other languages
French (fr)
Chinese (zh)
Inventor
刘颖
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2015032331A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation

Definitions

  • the present application relates to the field of computer processing technologies, and more particularly to an OpenCL program compilation method and compiler.
  • OpenCL Open Computing Language
  • the OpenCL program is mainly divided into two parts: the device program and the host program.
  • when a heterogeneous system is composed of a CPU and a GPU, the program running on the CPU is the host program and the program running on the GPU is the device program.
  • the execution process of the OpenCL program mainly includes: the host program controls the data transfer from the host end to the device end, the device end executes the device program to process the data, and the host program control transfers the processed result data from the device end to the host end.
  • the OpenCL program provides two data transfer modes, namely, a copy mode and a map mode.
  • the copy mode refers to copying data from the host memory to the device memory, or from the device memory to the host memory. Because the data must be copied and transferred within the system, the data transfer phase of the OpenCL program takes a long time in the copy mode; however, because the data is already in the device memory when the device program executes, the device program execution phase takes a short time. The mapping mode means that during the data transfer phase only a mapping relationship between the device memory and the host memory is established, and the data remains in the host memory. The data transfer phase therefore takes a short time, but the device program must access the data in the host memory when it executes, which makes the device execution phase take a long time.
  • users choose an appropriate data transfer mode according to the platform and other features when writing an OpenCL program, but this approach depends heavily on the user's subjective judgment and cannot effectively guarantee the execution efficiency of the OpenCL program.
  • the application provides an OpenCL program compilation method and a compiler to solve the technical problem that the efficiency of the OpenCL program cannot be effectively guaranteed in the prior art.
  • an open computing language OpenCL program compilation method including:
  • a compiled execution code file is generated in accordance with the compiled data transfer mode.
  • the calculating a program execution consumption time of the operation data in the first data transmission mode and the second data transmission mode respectively includes:
  • a second possible implementation manner of the first aspect is further provided.
  • the first data transmission mode is a replication mode
  • the second data transmission mode is a mapping mode
  • the verifying that the operation data is processed according to the second data transmission mode, and whether the operation data is secure comprises:
  • the verifying the operation data is processed according to the second data transmission mode, and whether the operation data is secure comprises:
  • a third possible implementation manner of the first aspect is further provided, where the first data transmission mode is a replication mode and the second data transmission mode is a mapping mode; or the first data transmission mode is a mapping mode and the second data transmission mode is a replication mode;
  • the calculating the execution consumption time of the operation data in the first data transmission mode and the second data transmission mode respectively includes:
  • the sum of the data transfer time calculated in the mapping mode and the device program execution time is taken as the execution consumption time of the operation data in the mapping mode.
  • a fourth possible implementation manner of the first aspect is further provided, where the total amount of memory access to the operation data is calculated according to the number of work items of the device program and the amount of memory access data per work item, as defined in the source program file.
  • a fifth possible implementation manner of the first aspect is further provided, where the data transmission rate, the memory access rate for accessing the device end, or the memory access rate for accessing the host end is predetermined based on the hardware characteristics of the execution hardware platform of the current heterogeneous system.
  • a compiler comprising:
  • a mode determining module configured to acquire a source program file of the OpenCL program, and determine a first data transmission mode of the operation data defined in the source program file;
  • the execution consumption time includes a data transmission time of the operation data and a device program execution time
  • a mode selection module configured to select a data transmission mode that consumes less time as a compiled data transmission mode of the operation data when the source program file is compiled.
  • a compiling module is configured to generate a compiled execution code file according to the compiled data transfer mode.
  • the method further includes:
  • a verification module configured to verify whether the operation data is safe when the operation data is processed according to the second data transmission mode, and if so, trigger the calculation module.
  • a second possible implementation manner of the second aspect is further provided, where the verification module is specifically configured to: when the first data transmission mode is a replication mode and the second data transmission mode is a mapping mode, analyze whether the host end performs a write operation on the operation data during execution of the program, and if not, determine that the operation data is secure; or, when the first data transmission mode is a mapping mode and the second data transmission mode is a replication mode, analyze whether the device end performs a write operation on the operation data during execution of the program, and if not, determine that the operation data is secure.
  • the first data transmission mode is a replication mode
  • the second data transmission mode is a mapping mode
  • the first data transmission mode is a mapping mode
  • the second data transmission mode is a replication mode
  • the calculation module includes:
  • a first transmission time calculation module configured to calculate a data transmission time of the operation data in the replication mode according to the total data volume of the operation data and the data transmission rate;
  • a first execution time calculation module configured to calculate a device program execution time of the operation data in the copy mode according to a total amount of memory accesses to the operation data and a memory access rate of the access device end during execution of the device program;
  • a first consumption time calculation module configured to use a sum of a data transmission time calculated in the copy mode and a device program execution time as an execution consumption time of the operation data in the copy mode
  • a second transmission time calculation module configured to calculate a data transmission time of the operation data in the mapping mode according to the establishment and elimination time of the mapping relationship between the host end and the device end
  • a second execution time calculation module configured to calculate a device program execution time of the operation data in the mapping mode according to a total amount of memory accesses to the operation data and a memory access rate of the access host during execution of the device program;
  • a second consumption time calculation module configured to use the sum of the data transmission time calculated in the mapping mode and the device program execution time as the execution consumption time of the operation data in the mapping mode
  • the embodiment of the present application provides an OpenCL program compiling method and a compiler. The compiler obtains a source program file of the OpenCL program and determines a first data transmission mode of the operation data defined in the source program file; calculates an execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode, respectively; selects the data transmission mode with the smaller execution consumption time as the compiled data transfer mode of the operation data when the source program file is compiled; and generates the compiled execution code file according to the compiled data transfer mode.
  • the OpenCL program compiled according to the embodiment of the present application can reduce the program execution time, improve the program execution efficiency, and can effectively ensure the execution efficiency in different heterogeneous systems.
  • FIG. 1 is a flowchart of an embodiment of an OpenCL program compiling method according to an embodiment of the present application
  • FIG. 2 is a flowchart of another embodiment of an OpenCL program compiling method according to an embodiment of the present application
  • FIG. 3 is a flowchart of still another embodiment of an OpenCL program compiling method according to an embodiment of the present application
  • FIG. 4 is a schematic structural diagram of an embodiment of a compiler according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of another embodiment of a compiler according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram of a computing module in a compiler according to an embodiment of the present disclosure
  • FIG. 7 is a schematic structural diagram of an embodiment of a computing device according to an embodiment of the present application.
  • the OpenCL program compiled according to the embodiment of the present application can reduce the program execution time, improve the program execution efficiency, and effectively ensure the execution efficiency in different heterogeneous systems.
  • FIG. 1 is a flowchart of an embodiment of an Open Computing Language (OpenCL) program compiling method according to an embodiment of the present application, which may include the following steps:
  • the OpenCL program is mainly divided into two parts: the host program and the device program (that is, the Kernel program).
  • the host program runs on the host side and the device program runs on the device side.
  • the host program controls the transfer of the operation data from the host to the device, and the transfer of the data from the device back to the host after the data has been processed.
  • the device program is executed by the device side to complete the processing of the operation data.
  • OpenCL program provides two data transfer modes, copy mode and map mode.
  • Copy mode refers to copying operational data from host-side memory to device-side memory, or from device-side memory to host-side memory.
  • the mapping mode means that only the mapping relationship between device memory and host memory is established during the data transmission phase, and the operation data still exists in the host side memory.
  • if the data transmission mode of the operation data is the copy mode, the data transmission phase may take a long time and the device program execution phase takes a short time; if the data transmission mode of the operation data is the mapping mode, the data transmission phase may take a short time and the device program execution phase takes a long time.
  • the replication mode is mainly applicable to application scenarios in which data is transmitted once and used multiple times.
  • the mapping mode is mainly applicable to application scenarios with large data transmission volume and small amount of access.
  • the source program file is written by the user. Because the choice depends heavily on the user's subjective judgment and demands considerable user experience, executing according to the data transfer mode of the operation data defined in the source program file cannot effectively guarantee the execution efficiency of the program.
  • OpenCL programs are highly portable and can be executed in different heterogeneous systems, but a program that executes efficiently in one heterogeneous system does not necessarily execute efficiently in another.
  • the inventor changed the approach in the process of implementing the present invention. Because an OpenCL program must be compiled during execution, turning the source program file into binary code that the computer can recognize, the embodiment of the present application uses the compiler and improves the compilation process applied to the source program file.
  • the operation data defined in the source program file and the first data transmission mode of the operation data are determined by analyzing the source program file.
  • the first data transmission mode may be a copy mode or a mapping mode, and is known from the function used in the source program file. For example, when the operation function corresponding to the operation data is clWriteBuffer, the data transmission mode of the operation data is the copy mode; when the corresponding operation function is clEnqueueMapBuffer, the data transmission mode of the operation data is the mapping mode.
  • the source file of a section of OpenCL program includes:
  • the defined operation data is A
  • the data type is Double (double-precision floating-point type)
  • the data transmission mode is known as the copy mode according to the function clWriteBuffer.
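The detection rule described above can be sketched as a simple text scan. This is only a minimal illustration, assuming the transfer call appears literally in the host source and that the whole source concerns a single piece of operation data; the function name `detect_transfer_mode` and the regular expression are hypothetical. The sketch also accepts `clEnqueueWriteBuffer`, which is the spelling the standard OpenCL host API uses for the buffer-write call.

```python
import re

# Hypothetical mapping from host API call names to the transfer mode they imply.
# clWriteBuffer is the name used in the text; clEnqueueWriteBuffer is the
# standard OpenCL host API spelling and is accepted here as an assumption.
COPY_CALLS = ("clWriteBuffer", "clEnqueueWriteBuffer")
MAP_CALLS = ("clEnqueueMapBuffer",)


def detect_transfer_mode(host_source: str) -> str:
    """Return "copy", "map", or "unknown" based on the first transfer call found."""
    pattern = r"\b(" + "|".join(COPY_CALLS + MAP_CALLS) + r")\s*\("
    match = re.search(pattern, host_source)
    if match is None:
        return "unknown"
    return "copy" if match.group(1) in COPY_CALLS else "map"
```

A real compiler would resolve the call per buffer through its syntax tree rather than by pattern matching; the scan only shows the decision itself.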
  • the source program file may define multiple pieces of operation data; the processing operations of the embodiments of the present application are performed for each piece of operation data.
  • the second data transmission mode is a data transmission mode different from the first data transmission mode.
  • when the first data transmission mode is the replication mode, the second data transmission mode is the mapping mode; when the first data transmission mode is the mapping mode, the second data transmission mode is the copy mode.
  • the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode, that is, the execution consumption time in the copy mode and in the mapping mode, is calculated respectively.
  • the execution consumption time includes a data transmission time of the operation data and a device program execution time.
  • the data transfer time is related to the total data amount of the operation data and the data transfer rate.
  • the total amount of data for this operational data can be known from the definition of operational data in the OpenCL source.
  • the data transmission rate can be predetermined in conjunction with the hardware characteristics of the execution platform of the current heterogeneous system.
  • in the copy mode, the data transfer time includes the time for transferring the operation data from the host-side memory to the device-side memory and the time for transferring it from the device-side memory back to the host-side memory; the two transfer times are approximately the same. Therefore, with the data transmission rate expressed as the time consumed per unit of data, the data transmission time can be taken as twice the product of the total data amount of the operation data and the data transmission rate.
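The round-trip estimate for the copy mode can be written out as a one-line cost function. This is a minimal sketch; the function name and the choice of clock cycles per byte as the unit are assumptions made for illustration.

```python
def copy_transfer_time(total_bytes: int, cycles_per_byte: float) -> float:
    """Estimated copy-mode transfer cost in cycles.

    The host-to-device and device-to-host transfers are assumed to cost the
    same, so the one-way cost is simply doubled.
    """
    one_way = total_bytes * cycles_per_byte
    return 2 * one_way
```

For instance, `copy_transfer_time(1024, 4)` evaluates to 8192 cycles under these assumptions.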
  • the device program execution time is related to the total amount of memory access to the operation data and the memory access rate to the device-side memory during the execution of the device program.
  • the total amount of memory access to the operational data can be calculated according to the number of work items of the device program and the amount of memory access data of the unit work item.
  • the number of work items and the amount of memory access data per work item can be obtained by analyzing the source program file.
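The device program execution time in the copy mode follows from two products, sketched below. The function and parameter names are hypothetical, and the per-byte access cost stands in for the memory access rate to device-side memory.

```python
def total_memory_access(num_work_items: int, bytes_per_work_item: int) -> int:
    """Total bytes of the operation data touched by the device program."""
    return num_work_items * bytes_per_work_item


def device_exec_time(num_work_items: int,
                     bytes_per_work_item: int,
                     device_cycles_per_byte: float) -> float:
    """Estimated execution cost in cycles when the data sits in device memory."""
    accessed = total_memory_access(num_work_items, bytes_per_work_item)
    return accessed * device_cycles_per_byte
```

With 1024 work items that each access 8 bytes at 4 cycles per byte, the estimate is 32768 cycles.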
  • the operational data is not actually transmitted between the host and the device, but is achieved by establishing a mapping relationship.
  • the data transmission time in the mapping mode is determined by the time when the mapping relationship is established, which includes the mapping relationship establishment time and the mapping relationship elimination time.
  • the mapping relationship elimination time is the same as the mapping relationship establishment time, so the data transmission time in the mapping mode can be equal to twice the mapping relationship establishment (or elimination) time.
  • the execution time of the device program is related to the total amount of memory access to the operation data during the execution of the device program and the memory access rate of the host memory.
  • the total amount of memory access to the operational data can be calculated based on the number of work items of the device program and the amount of memory access data per unit of work.
  • the access rate of the device program to the device side memory and the memory access rate to the host side can be determined in advance in conjunction with the hardware characteristics of the current heterogeneous execution platform.
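The mapping-mode estimate described above can be sketched in the same style. The names are hypothetical; `map_setup_cycles` stands for the time to establish the mapping relationship, which the text treats as equal to the time to eliminate it.

```python
def map_transfer_time(map_setup_cycles: float) -> float:
    """Mapping-mode "transfer" cost: establish plus eliminate the mapping."""
    return 2 * map_setup_cycles


def map_execution_consumption(map_setup_cycles: float,
                              accessed_bytes: int,
                              host_cycles_per_byte: float) -> float:
    """Total mapping-mode cost: mapping overhead plus accesses to host memory."""
    exec_time = accessed_bytes * host_cycles_per_byte
    return map_transfer_time(map_setup_cycles) + exec_time
```

Here the transfer term is independent of the data size, while the execution term grows with the amount of data the device program actually accesses.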
  • the data transmission mode with the smaller execution consumption time may be selected as the compiled data transfer mode of the operation data when the source program file is compiled. At compile time, the corresponding compiled execution code file is generated according to the compiled data transfer mode, so that when the OpenCL program is executed, the operation data is transmitted and processed according to the selected compiled data transfer mode, which shortens the execution consumption time and improves execution efficiency.
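The selection step amounts to taking the minimum over the two estimated costs. The sketch below is illustrative only; keeping the mode declared in the source when the costs tie is an assumption, since the text does not say how ties are handled.

```python
from typing import Dict


def select_compiled_mode(costs: Dict[str, float], declared_mode: str) -> str:
    """Return the mode with the lowest estimated cost.

    Assumption: on a tie, the mode declared in the source program is kept,
    so the program is only rewritten when there is a strict improvement.
    """
    best = min(costs.values())
    if costs[declared_mode] == best:
        return declared_mode
    return min(costs, key=costs.get)
```

For example, `select_compiled_mode({"copy": 10.0, "map": 5.0}, "copy")` picks `"map"`.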
  • the source program file is acquired, and the first data transmission mode of the operation data defined in the source program is determined; the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode is then calculated, respectively, and the mode with the smaller consumption time is selected as the compiled data transmission mode.
  • processing the operation data according to the selected compiled data transmission mode shortens the execution time and can effectively improve execution efficiency.
  • when the program is ported to another heterogeneous system, the technical solution of the embodiment of the present application can determine the data transmission mode of the operation data that suits that heterogeneous system, thereby effectively ensuring the execution efficiency of the program in different heterogeneous systems.
  • FIG. 2 is a flowchart of another embodiment of an open computing language OpenCL program compiling method according to an embodiment of the present application, which may include the following steps:
  • step 202: Verify whether the operation data is safe when processed according to the second data transmission mode; if yes, execute step 203; if no, end the process.
  • the second data transmission mode is different from the first data transmission mode.
  • when the first data transmission mode is the copy mode, the second data transmission mode is the mapping mode; when the first data transmission mode is the mapping mode, the second data transmission mode is the copy mode.
  • whether the operation data is safe can be determined by judging whether an error would occur in the operation of the program if the operation data were processed according to the second data transmission mode, for example, whether the operation data remains consistent on the host side and the device side.
  • when the first data transmission mode is the copy mode, the second data transmission mode is a mapping mode
  • in the copy mode, the host side and the device side each hold a copy of the operation data, so if the host side performs a write operation on the operation data during program execution, its copy becomes inconsistent with the copy on the device side.
  • if the operation data is instead processed according to the mapping mode, the operation data exists only on the host side, and during the execution of the program the operation data processed by the device side is always consistent with the operation data of the host side. A host-side write would therefore lead to processing that differs from the copy mode, making the operation data unsafe and causing an error in program execution.
  • verifying whether the operation data is secure when processed according to the second data transmission mode may be:
  • in the mapping mode, the operation data exists only on the host side. If the operation data is processed according to the copy mode instead, the operation data exists on both the host side and the device side. If the device side performs a write operation on the operation data, the data on the device side changes but the data on the host side does not change at the same time, which makes the data on the device side and the host side inconsistent. Processing the operation data in the copy mode is then unsafe, and an error occurs in program execution.
  • the operation flow of the embodiment continues only when it is determined that the operation data is safe when processed in accordance with the second data transmission mode.
  • data flow analysis technology can be used to analyze the definition and usage of the data in the OpenCL source program, to determine whether the host end or the device end performs a write operation on the operation data.
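The safety rule can be sketched as a check over access records. This is a simplified illustration, not a data flow analysis: it assumes a prior analysis pass has already produced a list of `(side, kind, buffer)` records covering only the accesses made during program execution, and the record format and function name are hypothetical.

```python
from typing import Iterable, Tuple

# (side, kind, buffer), e.g. ("host", "write", "A"); a hypothetical record format.
Access = Tuple[str, str, str]

# Side whose writes make the switch unsafe, keyed by the declared (first) mode:
# copy -> map is blocked by host writes, map -> copy by device writes.
BLOCKING_SIDE = {"copy": "host", "map": "device"}


def is_safe_to_switch(first_mode: str, buffer: str,
                      accesses: Iterable[Access]) -> bool:
    """True if the buffer can be processed in the other transfer mode."""
    blocking = BLOCKING_SIDE[first_mode]
    return not any(
        side == blocking and kind == "write" and name == buffer
        for side, kind, name in accesses
    )
```

With accesses `[("device", "write", "A"), ("host", "read", "A")]`, switching A from copy to map is considered safe, while switching it from map to copy is not.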
  • the first data transmission mode may be a replication mode, and the second data transmission mode may be a mapping mode; or the first data transmission mode may be a mapping mode, and the second data transmission mode may be a replication mode.
  • the calculating the execution consumption time of the operation data in the first data transmission mode and the second data transmission mode may include:
  • the data transmission time of the operation data in the copy mode is calculated according to the total data amount of the operation data and the data transmission rate.
  • the data transmission rate may be represented by a unit data transmission consumption time, and the data transmission time may be equal to twice the product of the total data amount of the operation data and the unit data transmission consumption time.
  • the device program execution time of the operation data is calculated according to the total amount of memory accesses to the operation data and the memory access rate of the access device end during execution of the device program.
  • the memory access rate of the access device may be determined according to the hardware characteristics of the current heterogeneous system program execution platform.
  • the total amount of memory access to the operational data may be equal to the product of the number of work items of the device program and the amount of memory access data per work item.
  • the work item work-item is the smallest execution unit.
  • the number of work items indicates how many units the computation is divided into.
  • the amount of memory access data for each work item can be obtained by analyzing the definitions in the OpenCL source program with data flow analysis techniques.
  • the sum of the data transfer time calculated in the copy mode and the device program execution time is taken as the execution time of the operation data in the copy mode.
  • the data transmission time of the operation data in the mapping mode is calculated according to the establishment and elimination time of the mapping relationship between the host end and the device end.
  • mapping relationship establishment and elimination time can be predetermined based on the hardware characteristics of the heterogeneous system execution platform.
  • the device program execution time of the operation data is calculated according to the total amount of memory access of the operation data and the memory access rate of the access host during execution of the device program.
  • the memory access rate of the access host may be predetermined according to the hardware characteristics of the heterogeneous system execution platform.
  • the total amount of memory access to the operational data may be equal to the product of the number of work items of the device program and the amount of memory access data per unit of work item.
  • the sum of the data transmission time calculated in the mapping mode and the device program execution time is taken as the execution time of the operation data in the mapping mode.
  • the compiled data transmission mode may be the first data transmission mode of the operation data or the second data transmission mode.
  • the OpenCL program can be run on the machine, reducing execution time and improving execution efficiency.
  • the source file is obtained, and the first data transmission mode of the operation data defined in the source program is determined. It is then verified whether the operation data is secure when processed according to the second data transmission mode. If it is, the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode is calculated, respectively, and the data transmission mode with the smaller consumption time is selected as the compiled data transmission mode of the operation data at compile time. The compiled execution code file is generated accordingly, so that when the program runs on the machine, the operation data is processed according to the selected compiled data transmission mode, which shortens the execution time and can effectively improve execution efficiency.
  • when the program is ported to another heterogeneous system, the technical solution of the embodiment of the present application can be used to determine the data transmission mode of the operation data that suits that heterogeneous system, thereby ensuring the execution efficiency of the program in different heterogeneous systems.
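The overall flow of this embodiment (verify, then estimate, then select) can be sketched as one function. The callables `is_safe` and `cost_of` are hypothetical stand-ins for the verification and calculation steps, and returning the declared mode when verification fails is an assumption about what "end the process" means for the generated code.

```python
from typing import Callable

MODES = ("copy", "map")


def choose_compiled_mode(first_mode: str,
                         is_safe: Callable[[str], bool],
                         cost_of: Callable[[str], float]) -> str:
    """Pick the transfer mode to compile with for one piece of operation data."""
    if first_mode not in MODES:
        raise ValueError(f"unknown transfer mode: {first_mode!r}")
    second_mode = "map" if first_mode == "copy" else "copy"
    # Verification comes first: an unsafe switch is never considered.
    if not is_safe(second_mode):
        return first_mode  # assumption: keep the mode declared in the source
    # Switch only when the other mode is strictly cheaper.
    if cost_of(second_mode) < cost_of(first_mode):
        return second_mode
    return first_mode
```

Called with `is_safe=lambda mode: True` and `cost_of={"copy": 100.0, "map": 40.0}.get`, a buffer declared in copy mode would be compiled in mapping mode.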
  • FIG. 3 is a flowchart of another embodiment of an open computing language OpenCL program compiling method according to an embodiment of the present application.
  • a fragment of the source program file of the following OpenCL program is used as an example:
  • the operation data defined in the source program file segment includes the operation data A and the operation data B, and the data transmission mode is the copy mode.
  • the following description mainly uses the operation data A as an example.
  • the processing procedure for the operation data B is similar to that for the operation data A, and will not be described again.
  • the method can include the following steps:
  • step 302: Verify whether the operation data is safe when processed according to the mapping mode; if yes, execute step 303; if no, end the process.
  • the data transfer time of the operation data A in the copy mode is Ct1 = Vt * St * 2.
  • Vt is the total data amount of the operation data A. As the above program shows, the data type of the operation data A is double-precision floating point, which occupies 8 bytes, and the vector length of the operation data A is 65536; therefore, the total data amount of the operation data A is 65536*8 B (bytes).
  • St is the time consumed per unit of data and indicates the data transmission rate. In this embodiment, it is assumed to be 4 cycles/B (4 clock cycles per byte).
  • the device program execution time of the operation data A in the copy mode is Ca1 = Va * Sab.
  • Va is the total amount of memory access of the device program to the operation data A
  • Nwi is 65536
  • Sab is the time consumed per unit of data when the device program accesses the device-side memory, and indicates the memory access rate to the device-side memory. It is assumed to be 4 cycles/B.
  • the device program execution time of the operation data A in the mapping mode is Ca2 = Va * Sam.
  • Va = 128 KB.
  • Sam is the time consumed per unit of data when the device program accesses the host-side memory, and indicates the memory access rate to the host-side memory. It is assumed to be 16 cycles/B.
  • the mapping mode is selected as the compiled data transmission mode of the operation data A, so that at compile time the data transmission mode of the operation data A is changed and a compiled execution code file corresponding to the mapping mode is generated.
  • the operation data A is processed in accordance with the mapping mode, so that the execution processing time of the operation data A can be reduced, and the program execution efficiency is improved.
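The figures quoted for operation data A can be plugged into the cost formulas as a quick check. This is a sketch under stated assumptions: the mapping establishment time does not appear in the text as extracted, so `MAP_SETUP_CYCLES` is a purely hypothetical value, and 1 KB is taken as 1024 bytes.

```python
# Values stated in the text for operation data A.
VT_BYTES = 65536 * 8      # total data amount Vt
ST = 4                    # transfer cost St, cycles per byte
VA_BYTES = 128 * 1024     # total memory access Va (128 KB, assuming 1 KB = 1024 B)
SAB = 4                   # device-memory access cost Sab, cycles per byte
SAM = 16                  # host-memory access cost Sam, cycles per byte

# Hypothetical: cycles to establish (or eliminate) the mapping relationship.
MAP_SETUP_CYCLES = 1000

# Copy mode: round-trip transfer plus accesses to device memory.
ct1 = VT_BYTES * ST * 2
ca1 = VA_BYTES * SAB
copy_total = ct1 + ca1

# Mapping mode: establish and eliminate the mapping plus accesses to host memory.
ct2 = 2 * MAP_SETUP_CYCLES
ca2 = VA_BYTES * SAM
map_total = ct2 + ca2

chosen = "map" if map_total < copy_total else "copy"
print(copy_total, map_total, chosen)
```

Under these numbers the copy mode costs 4718592 cycles and the mapping mode 2099152 cycles, so the mapping mode is chosen; with the stated rates it stays cheaper as long as establishing and eliminating the mapping together cost less than 2621440 cycles.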
  • when the program is ported to another heterogeneous system, the solution of the embodiment of the present application can likewise determine the data transmission mode of the operation data A that suits that heterogeneous system, ensuring the execution efficiency of the program there.
  • FIG. 4 is a schematic structural diagram of an embodiment of a compiler according to an embodiment of the present disclosure, where the compiler may include:
  • the mode determining module 401 is configured to acquire a source program file of the OpenCL program, and determine a first data transmission mode of the operation data defined in the source program file.
  • the first data transmission mode may be a copy mode or a mapping mode, as known by a function defined in the source program file.
  • the calculating module 402 is configured to calculate an execution consumption time of the operation data in the first data transmission mode and the second data transmission mode, respectively.
  • the second data transmission mode is different from the first data transmission mode.
  • when the first data transmission mode is the copy mode, the second data transmission mode is the mapping mode; when the first data transmission mode is the mapping mode, the second data transmission mode is the copy mode.
  • the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode, that is, the execution consumption time in the copy mode and in the mapping mode, is calculated respectively.
  • the execution consumption time includes a data transmission time of the operation data and a device program execution time.
  • the mode selection module 403 is configured to select a data transmission mode with a small consumption time as a compiled data transmission mode of the operation data when the OpenCL source program is compiled.
  • the compiling module 404 is configured to generate a compiled execution code file according to the compiled data transmission mode.
  • the data transmission mode with the smaller consumption time can be selected as the compiled data transfer mode of the operation data when the OpenCL source program is compiled. At compile time, the corresponding compiled execution code file is generated according to the compiled data transfer mode, so that when the OpenCL program is executed, the operation data is transmitted and processed according to the selected compiled data transfer mode, which shortens the execution consumption time and improves execution efficiency.
  • the compiler when the compiler acquires the source program file for compiling, first determines a first data transmission mode of the operation data defined in the source program; and then separately calculates the operation data in the first data transmission mode and the second data respectively.
  • the execution time in the transfer mode consumes time, and the data transfer mode in which the consumption time is small is selected as the compiled data transfer mode of the operation data at the time of compiling, whereby the compiled execution code file can be generated, so that when the program is run in the machine,
  • the operation data can be processed according to the selected compiled data transmission mode, the execution time is shortened, the execution efficiency can be effectively improved, and the program is transplanted to another heterogeneous system, and the technical solution of the embodiment of the present application can be used. It is determined that the operational data conforms to the data transmission mode of the heterogeneous system, thereby ensuring the execution efficiency of the program in different heterogeneous systems.
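  • The selection performed by the mode selection module can be sketched as follows. This is a minimal illustration, not the compiler's actual code: the function name `select_compiled_mode` and the mode labels `"copy"` and `"map"` are hypothetical, and keeping the programmer's original mode on a tie is an assumption, since the text does not say how ties are handled.

```python
from typing import Dict


def select_compiled_mode(costs: Dict[str, float], first_mode: str) -> str:
    """Return the data transmission mode with the smaller execution consumption time.

    `costs` maps a mode label to its estimated execution consumption time.
    On a tie the first (programmer-chosen) mode is kept; this tie rule is an
    assumption made for the sketch.
    """
    best = min(costs.values())
    if costs[first_mode] == best:
        return first_mode
    # Otherwise return the mode whose estimated time is the minimum.
    return min(costs, key=costs.get)
```

  • For example, `select_compiled_mode({"copy": 5.0, "map": 3.0}, "copy")` chooses the mapping mode even though the source program was written in copy mode.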
  • FIG. 5 is a schematic structural diagram of another embodiment of a compiler according to an embodiment of the present disclosure, where the compiler may include:
  • the mode determining module 501 is configured to acquire a source program file of the OpenCL program, and determine a first data transmission mode of the operation data defined in the source program file.
  • the verification module 501 is configured to verify whether the operation data is safe when the operation data is processed according to the second data transmission mode.
  • Whether the operation data is safe can be determined by judging whether the program would run incorrectly if the operation data were processed according to the second data transmission mode, for example, whether the operation data would remain consistent between the host side and the device side.
  • the verification module may be specifically configured to: when the first data transmission mode is the copy mode and the second data transmission mode is the mapping mode, analyze whether the host end writes the operation data during program execution, and if not, determine that the operation data is safe when processed according to the second data transmission mode; when the first data transmission mode is the mapping mode and the second data transmission mode is the copy mode, analyze whether the device end writes the operation data during program execution, and if not, determine that the operation data is safe when processed according to the second data transmission mode.
  • data flow analysis may be used to analyze the definitions and uses of the data in the source program file to determine whether the host end or the device end writes the operation data.
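  • The safety rule above can be sketched in a few lines, assuming the data flow analysis has already produced a list of accesses. The `Access` record, the function `is_switch_safe`, and the mode labels are hypothetical names introduced for illustration; the analysis that would populate the list is not shown.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Access:
    side: str    # "host" or "device"
    kind: str    # "read" or "write"
    buffer: str  # name of the operation data being accessed


def is_switch_safe(first_mode: str, accesses: Iterable[Access], buffer: str) -> bool:
    """Check whether `buffer` can be switched away from `first_mode`.

    copy -> map is treated as safe when the host end never writes the buffer;
    map -> copy is treated as safe when the device end never writes it.
    """
    blocking_side = {"copy": "host", "map": "device"}[first_mode]
    return not any(
        a.buffer == buffer and a.kind == "write" and a.side == blocking_side
        for a in accesses
    )
```

  • Only when this check passes would the calculating module go on to estimate both modes; otherwise the first data transmission mode is simply kept.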
  • the calculating module 502 is configured to calculate the program execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode, respectively, when the verification module 501 verifies that the operation data is safe.
  • the second data transmission mode is different from the first data transmission mode, and the execution consumption time includes a data transmission time of the operation data and a device program execution time.
  • the first data transmission mode may be the copy mode and the second data transmission mode the mapping mode; or the first data transmission mode may be the mapping mode and the second data transmission mode the copy mode.
  • the computing module may specifically include:
  • the first transmission time calculation module 601 is configured to calculate the data transmission time of the operation data in the copy mode according to the total data amount of the operation data and the data transmission rate.
  • the data transmission rate may be expressed as the time consumed to transmit one unit of data, and the data transmission time may be equal to twice the product of the total data amount of the operation data and the unit data transmission consumption time.
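  • As a small worked sketch of the copy-mode transmission time: the function name and the choice of bytes and seconds as units are assumptions, and the factor of 2 presumably covers the transfer to the device and the transfer of results back to the host, although the text does not state the reason.

```python
def copy_transfer_time(total_bytes: int, seconds_per_byte: float) -> float:
    """Copy-mode data transmission time: twice the data amount times the unit time.

    `seconds_per_byte` is the unit data transmission consumption time;
    bytes and seconds are assumed units for this sketch.
    """
    return 2 * total_bytes * seconds_per_byte
```

  • With 1000 bytes of operation data and a unit time of 0.5 seconds per byte, the estimate is `2 * 1000 * 0.5`, that is, 1000 seconds.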
  • the first execution time calculation module 602 is configured to calculate the device program execution time of the operation data in the copy mode according to the total amount of memory access data for the operation data during execution of the device program and the memory access rate for accessing the device end.
  • the total amount of memory accesses to the operation data is calculated according to the number of work items of the device program defined in the source program file and the amount of memory access data of the unit work item.
  • the memory access rate of the access device may be determined according to the hardware characteristics of the current heterogeneous system program execution platform.
  • the total amount of memory access to the operational data may be equal to the product of the number of work items of the device program and the amount of memory access data per unit of work item.
  • a work item (work-item) is the smallest execution unit.
  • the number of work items indicates how many units the computation is divided into.
  • the amount of memory access data of each work item can be obtained from the definitions in the source program file, for example by using data flow analysis techniques.
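  • The total memory access amount can be sketched as below. In OpenCL the number of work items of a kernel launch is the product of the global work size in each dimension; the function name and the use of bytes are assumptions for illustration.

```python
from math import prod
from typing import Sequence


def total_memory_access(global_work_size: Sequence[int], bytes_per_work_item: int) -> int:
    """Total memory access amount = number of work items x access amount per work item."""
    work_items = prod(global_work_size)  # e.g. (1024, 1024) gives 1048576 work items
    return work_items * bytes_per_work_item
```

  • For a 1024 by 1024 launch in which each work item touches 8 bytes, the total is 8388608 bytes.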
  • the first consumption time calculation module 603 is configured to use the sum of the data transmission time calculated in the copy mode and the device program execution time as the execution consumption time of the operation data in the copy mode.
  • the second transmission time calculation module 604 is configured to calculate the data transmission time of the operation data in the mapping mode according to the time needed to establish and eliminate the mapping relationship between the host end and the device end.
  • the establishment and elimination time of the mapping relationship can be predetermined based on the hardware characteristics of the heterogeneous system execution platform.
  • the second execution time calculation module 605 is configured to calculate a device program execution time of the operation data in the mapping mode according to a total amount of memory accesses to the operation data and a memory access rate of the access host end during execution of the device program.
  • the memory access rate of the access host may be predetermined according to the hardware characteristics of the heterogeneous system execution platform.
  • the total amount of memory access to the operational data may be equal to the product of the number of work items of the device program and the amount of memory access data per unit of work item.
  • the second consumption time calculation module 606 is configured to use the sum of the data transmission time and the device program execution time calculated in the mapping mode as the execution consumption time of the operation data in the mapping mode.
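  • Putting modules 601 to 606 together gives a cost model along the following lines. This is a sketch under stated assumptions: the `HardwareProfile` fields and function names are hypothetical, and the device program execution time is assumed to be the memory access amount divided by the relevant access rate, which the text implies but does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareProfile:
    transfer_s_per_byte: float     # unit data transmission consumption time
    device_mem_bytes_per_s: float  # memory access rate when accessing the device end
    host_mem_bytes_per_s: float    # memory access rate when accessing the host end
    map_setup_teardown_s: float    # time to establish and eliminate the mapping


def copy_cost(p: HardwareProfile, data_bytes: int, access_bytes: int) -> float:
    """Execution consumption time in copy mode: transmission time + execution time."""
    transfer = 2 * data_bytes * p.transfer_s_per_byte
    execute = access_bytes / p.device_mem_bytes_per_s  # assumed: amount / rate
    return transfer + execute


def map_cost(p: HardwareProfile, access_bytes: int) -> float:
    """Execution consumption time in mapping mode: mapping overhead + execution time."""
    execute = access_bytes / p.host_mem_bytes_per_s  # assumed: amount / rate
    return p.map_setup_teardown_s + execute
```

  • The two results are what the mode selection module compares; the profile values would come from the predetermined hardware characteristics of the execution platform.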
  • the mode selection module 503 is configured to select the data transmission mode with the smaller execution consumption time as the compiled data transmission mode of the operation data when the OpenCL source program is compiled.
  • the compiling module 504 is configured to generate a compiled execution code file according to the compiled data transmission mode.
  • the compiled data transmission mode may be the first data transmission mode of the operation data or the second data transmission mode.
  • the OpenCL program can be run on the machine, reducing execution time and improving execution efficiency.
  • the compiler obtains the source program file and determines a first data transmission mode of the operation data defined in the source program file; verifies whether the operation data is safe if it is processed according to the second data transmission mode; when it is safe, calculates the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode, respectively; and selects the data transmission mode with the smaller consumption time as the compiled data transmission mode of the operation data at compile time. The compiled execution code file is generated accordingly, so that when the program runs on the machine, the operation data is processed according to the selected compiled data transmission mode, which shortens execution time and effectively improves execution efficiency. When the program is ported to another heterogeneous system, the technical solution of the embodiments of the present application can be used to determine the data transmission mode of the operation data that suits that heterogeneous system, thereby ensuring the execution efficiency of the program in different heterogeneous systems.
  • in practical applications, the compiler described in the foregoing embodiments may be deployed in a computing device.
  • a computing device that deploys the compiler of the embodiments of the present application can compile the source program file into machine-recognizable code.
  • at compile time, a data transmission mode with a shorter execution consumption time can be selected for the operation data defined in the source program file, so that the execution time of the program is reduced and program execution efficiency is improved.
  • the embodiment of the present application further provides a computing device, where the computing device includes at least a memory 701 and a processor 702 connected to the memory 701 via a bus.
  • the memory 701 stores a set of program instructions.
  • the memory 701 may be a high speed RAM memory or a non-volatile memory such as at least one disk memory or the like.
  • the processor 702 is configured to invoke a program instruction stored by the memory 701, and perform the following operations:
  • a compiled execution code file is generated in accordance with the compiled data transfer mode.
  • the processor 702 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.
  • the computing device can be used to execute any of the OpenCL program compilation methods shown in FIG. 1 to FIG. 2 provided by the embodiments of the present application.
  • the present application can be implemented by means of software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solution of the present application may be embodied, in essence, in the form of a software product, which may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disc, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present application or in certain parts of the embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

An Open Computing Language (OpenCL) program compilation method and compiler, said method comprising: obtaining OpenCL source files and determining a first data transmission mode for operation data defined in said source files; calculating the execution time of said operation data in said first data transmission mode and in a second data transmission mode, said second data transmission mode being different from said first data transmission mode, and said execution time comprising the data transmission time of the operation data as well as the device program execution time; selecting the data transmission mode having the shorter execution time to serve as the compiled data transmission mode for said operation data when said source files are compiled; and generating a compiled execution code file according to said compiled data transmission mode. The invention effectively ensures program execution efficiency.

Description

OpenCL program compilation method and compiler
Technical field
The present application relates to the field of computer processing technologies, and more particularly to an OpenCL program compilation method and compiler.
Background
OpenCL (Open Computing Language) is the first open, free, general-purpose parallel programming standard for heterogeneous systems. It provides software developers with a unified programming environment for writing efficient, portable code for high-performance computing servers, desktop computing systems, handheld devices, and the like.
An OpenCL program is mainly divided into two parts: the device program and the host program. For example, when a heterogeneous system consists of a CPU and a GPU and the program running on the CPU is the host program, the program running on the GPU is the device program. The execution of an OpenCL program mainly includes the following: the host program controls the transfer of data from the host end to the device end, the device end executes the device program to process the data, and the host program controls the transfer of the result data from the device end back to the host end.
As the execution process above shows, the execution efficiency of an OpenCL program is mainly affected by the data transmission phase and the device program execution phase. OpenCL therefore provides two data transmission modes: copy mode and mapping mode. Copy mode means copying data from host memory to device memory, or from device memory to host memory. Because the data must actually be copied and transferred within the system, the data transmission phase takes longer in copy mode; however, because the data is already in device memory when the device program executes, the device program execution phase takes less time. Mapping mode means that, in the data transmission phase, only a mapping from device memory to host memory is established and the data remains in host memory. The data transmission phase therefore takes less time, but the device program must access data in host memory when it executes, so the device execution phase takes longer.
In the course of implementing the present invention, the inventor found that, in the prior art, to ensure the execution efficiency of an OpenCL program, a technician usually chooses a suitable data transmission mode in advance when writing the OpenCL program, based on the system's application scenarios, the hardware platform, and other characteristics. This approach is highly subjective and cannot effectively guarantee the execution efficiency of the OpenCL program.
Summary of the invention
The present application provides an OpenCL program compilation method and compiler to solve the technical problem that the prior art cannot effectively guarantee the execution efficiency of OpenCL programs.
To achieve the above objective, the present application provides the following technical solutions:
In a first aspect, an Open Computing Language (OpenCL) program compilation method is provided, including:
obtaining a source program file of an OpenCL program, and determining a first data transmission mode of the operation data defined in the source program file;
calculating the execution consumption time of the operation data in the first data transmission mode and in a second data transmission mode, respectively, where the second data transmission mode is different from the first data transmission mode, and the execution consumption time includes the data transmission time of the operation data and the device program execution time;
selecting the data transmission mode with the smaller execution consumption time as the compiled data transmission mode of the operation data when the source program file is compiled; and
generating a compiled execution code file according to the compiled data transmission mode.
In a first possible implementation of the first aspect, calculating the program execution consumption time of the operation data in the first data transmission mode and the second data transmission mode, respectively, includes:
verifying whether the operation data is safe when the operation data is processed according to the second data transmission mode; and
when the operation data is safe, calculating the program execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode, respectively.
With reference to the first possible implementation of the first aspect, a second possible implementation of the first aspect is further provided. When the first data transmission mode is the copy mode and the second data transmission mode is the mapping mode, verifying whether the operation data is safe when processed according to the second data transmission mode includes:
analyzing whether the host end writes the operation data during program execution, and if not, determining that the operation data is safe when processed according to the second data transmission mode;
when the first data transmission mode is the mapping mode and the second data transmission mode is the copy mode, verifying whether the operation data is safe when processed according to the second data transmission mode includes:
analyzing whether the device end writes the operation data during program execution, and if not, determining that the operation data is safe when processed according to the second data transmission mode.
With reference to the first aspect or any of the foregoing possible implementations of the first aspect, a third possible implementation of the first aspect is further provided, in which the first data transmission mode is the copy mode and the second data transmission mode is the mapping mode, or the first data transmission mode is the mapping mode and the second data transmission mode is the copy mode;
calculating the execution consumption time of the operation data in the first data transmission mode and the second data transmission mode, respectively, includes:
calculating the data transmission time of the operation data in the copy mode according to the total data amount of the operation data and the data transmission rate;
calculating the device program execution time of the operation data in the copy mode according to the total amount of memory access data for the operation data during execution of the device program and the memory access rate for accessing the device end;
using the sum of the data transmission time and the device program execution time calculated in the copy mode as the execution consumption time of the operation data in the copy mode;
calculating the data transmission time of the operation data in the mapping mode according to the time needed to establish and eliminate the mapping relationship between the host end and the device end;
calculating the device program execution time of the operation data in the mapping mode according to the total amount of memory access data for the operation data during execution of the device program and the memory access rate for accessing the host end; and
using the sum of the data transmission time and the device program execution time calculated in the mapping mode as the execution consumption time of the operation data in the mapping mode.
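The six calculation steps above can be summarized in formula form. The symbols are introduced here for illustration and do not appear in the source: D is the total data amount of the operation data, t_unit the time to transmit one unit of data, N_wi the number of work items, a_wi the memory access amount per work item, r_dev and r_host the memory access rates for the device end and the host end, and T_setup the time to establish and eliminate the mapping. The factor of 2 follows the statement elsewhere in this document that the copy-mode transmission time may equal twice the data amount times the unit transmission time; dividing the access amount by the access rate is an assumption, as is keeping the mapping mode on a tie.

```latex
\[
\begin{aligned}
A &= N_{\text{wi}} \cdot a_{\text{wi}} \\
T_{\text{copy}} &= 2\,D\,t_{\text{unit}} + \frac{A}{r_{\text{dev}}} \\
T_{\text{map}} &= T_{\text{setup}} + \frac{A}{r_{\text{host}}} \\
\text{mode} &=
\begin{cases}
\text{copy} & \text{if } T_{\text{copy}} < T_{\text{map}} \\
\text{mapping} & \text{otherwise}
\end{cases}
\end{aligned}
\]
```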
With reference to the third possible implementation of the first aspect, a fourth possible implementation of the first aspect is further provided, in which the total amount of memory access data for the operation data is calculated from the number of work items of the device program defined in the source program file and the amount of memory access data per work item.
With reference to the third possible implementation of the first aspect, a fifth possible implementation of the first aspect is further provided, in which the data transmission rate, the memory access rate for accessing the device end, or the memory access rate for accessing the host end is predetermined according to the hardware characteristics of the hardware platform on which the current heterogeneous system executes.
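Because the rates are fixed per hardware platform, the same program can end up with different compiled modes on different systems. The sketch below illustrates this with two made-up profiles; the profile names, field names, and every number are illustrative assumptions rather than measurements, and the execution time is again assumed to be access amount divided by access rate.

```python
# Hypothetical per-platform parameters (illustrative values, not measurements).
PROFILES = {
    "discrete_gpu": {
        "transfer_s_per_byte": 1e-9,
        "device_bytes_per_s": 200e9,
        "host_bytes_per_s": 10e9,
        "map_overhead_s": 2e-5,
    },
    "integrated_gpu": {
        "transfer_s_per_byte": 1e-10,
        "device_bytes_per_s": 25e9,
        "host_bytes_per_s": 25e9,
        "map_overhead_s": 5e-6,
    },
}


def pick_mode(profile: dict, data_bytes: int, access_bytes: int) -> str:
    """Return "copy" or "map", whichever has the smaller estimated time."""
    copy = (2 * data_bytes * profile["transfer_s_per_byte"]
            + access_bytes / profile["device_bytes_per_s"])
    mapped = profile["map_overhead_s"] + access_bytes / profile["host_bytes_per_s"]
    return "copy" if copy < mapped else "map"


DATA = 64 * 1024 * 1024   # 64 MiB of operation data
ACCESS = 32 * DATA        # each byte touched 32 times on average (assumed)
CHOICES = {name: pick_mode(p, DATA, ACCESS) for name, p in PROFILES.items()}
```

With these assumed numbers the discrete profile favors copy mode while the integrated profile, where host and device memory are equally fast, favors mapping mode.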
In a second aspect, a compiler is provided, including:
a mode determining module, configured to obtain a source program file of an OpenCL program and determine a first data transmission mode of the operation data defined in the source program file;
a calculating module, configured to calculate the execution consumption time of the operation data in the first data transmission mode and in a second data transmission mode, respectively, where the second data transmission mode is different from the first data transmission mode, and the execution consumption time includes the data transmission time of the operation data and the device program execution time;
a mode selection module, configured to select the data transmission mode with the smaller consumption time as the compiled data transmission mode of the operation data when the source program file is compiled; and
a compiling module, configured to generate a compiled execution code file according to the compiled data transmission mode.
In a first possible implementation of the second aspect, the compiler further includes:
a verification module, configured to verify whether the operation data is safe when the operation data is processed according to the second data transmission mode, and if so, to trigger the calculating module.
With reference to the first possible implementation of the second aspect, a second possible implementation of the second aspect is further provided, in which the verification module is specifically configured to: when the first data transmission mode is the copy mode and the second data transmission mode is the mapping mode, analyze whether the host end writes the operation data during program execution, and if not, determine that the operation data is safe; or, when the first data transmission mode is the mapping mode and the second data transmission mode is the copy mode, analyze whether the device end writes the operation data during program execution, and if not, determine that the operation data is safe.
With reference to the second aspect or any of the foregoing possible implementations of the second aspect, the first data transmission mode is the copy mode and the second data transmission mode is the mapping mode, or the first data transmission mode is the mapping mode and the second data transmission mode is the copy mode;
the calculating module includes:
a first transmission time calculation module, configured to calculate the data transmission time of the operation data in the copy mode according to the total data amount of the operation data and the data transmission rate;
a first execution time calculation module, configured to calculate the device program execution time of the operation data in the copy mode according to the total amount of memory access data for the operation data during execution of the device program and the memory access rate for accessing the device end;
a first consumption time calculation module, configured to use the sum of the data transmission time and the device program execution time calculated in the copy mode as the execution consumption time of the operation data in the copy mode;
a second transmission time calculation module, configured to calculate the data transmission time of the operation data in the mapping mode according to the time needed to establish and eliminate the mapping relationship between the host end and the device end;
a second execution time calculation module, configured to calculate the device program execution time of the operation data in the mapping mode according to the total amount of memory access data for the operation data during execution of the device program and the memory access rate for accessing the host end; and
a second consumption time calculation module, configured to use the sum of the data transmission time and the device program execution time calculated in the mapping mode as the execution consumption time of the operation data in the mapping mode.
As can be seen from the above technical solutions, compared with the prior art, the embodiments of the present application provide an OpenCL program compilation method and compiler. The compiler obtains the source program file of an OpenCL program and determines a first data transmission mode of the operation data defined in the source program file; calculates the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode, respectively; selects the data transmission mode with the smaller execution consumption time as the compiled data transmission mode of the operation data when the source program file is compiled; and generates a compiled execution code file according to the compiled data transmission mode. An OpenCL program compiled according to the embodiments of the present application consumes less execution time, executes more efficiently, and retains its execution efficiency across different heterogeneous systems.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Apparently, the drawings described below are merely embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a flowchart of one embodiment of an OpenCL program compilation method according to an embodiment of the present application;
FIG. 2 is a flowchart of another embodiment of an OpenCL program compilation method according to an embodiment of the present application;
FIG. 3 is a flowchart of yet another embodiment of an OpenCL program compilation method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of one embodiment of a compiler according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of another embodiment of a compiler according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of the calculating module in a compiler according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of one embodiment of a computing device according to an embodiment of the present application.
具体实施方式detailed description
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。The technical solutions in the embodiments of the present application are clearly and completely described in the following with reference to the drawings in the embodiments of the present application. It is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without departing from the inventive scope are the scope of the present application.
本申请实施例的主要思想之一包括:One of the main ideas of the embodiments of the present application includes:
编译器获取OpenCL程序的源程序文件,并确定所述源程序文件中定义的操作数据的第一数据传输模式;计算所述操作数据分别在所述第一数据传输模式和第二数据传输模式下的执行消耗时间,选择所述执行消耗时间较小的数据传输模式作为所述源程序文件编译时所述操作数据的编译数据传输模式,并按照所述编译数据传输模式生成编译执行代码文件。按照本申请实施例编译后的OpenCL程序可以减小程序执行消耗时间,提高了程序执行效率,还可以有效保证在不同异构系统中的执行效率。Compiling a source file of the OpenCL program, and determining a first data transmission mode of the operation data defined in the source program file; calculating the operation data in the first data transmission mode and the second data transmission mode, respectively Execution consumes time, selects the data transfer mode with less execution time as the compiled data transfer mode of the operation data when the source program file is compiled, and generates a compiled execution code file according to the compiled data transfer mode. The OpenCL program compiled according to the embodiment of the present application can reduce the program execution time, improve the program execution efficiency, and effectively ensure the execution efficiency in different heterogeneous systems.
FIG. 1 is a flowchart of an embodiment of an OpenCL (Open Computing Language) program compilation method according to the present application. The method may include the following steps:
101: Obtain the source program file of an OpenCL (Open Computing Language) program, and determine the first data transmission mode of the operation data defined in the source program file.
An OpenCL program is mainly divided into two parts: a host program and a device program (that is, the kernel program). The host program runs on the host side, and the device program runs on the device side. The host program controls the transfer of the operation data from the host side to the device side, and the transfer of the processed operation data from the device side back to the host side. The device program is executed by the device side and performs the processing of the operation data.
The execution efficiency of an OpenCL program is mainly affected by the data transmission phase and the device program execution phase. OpenCL provides two data transmission modes: copy mode and mapping mode.
Copy mode means copying the operation data from host-side memory to device-side memory, or from device-side memory to host-side memory.
Mapping mode means that, in the data transmission phase, only a mapping relationship from device memory to host memory is established, and the operation data remains in host-side memory.
If the data transmission mode of the operation data is copy mode, the data transmission phase may take longer, while the device program execution phase takes less time. If the data transmission mode of the operation data is mapping mode, the data transmission phase may take less time, while the device program execution phase takes longer.
Copy mode is therefore mainly suited to scenarios in which data is transferred once and used many times; mapping mode is mainly suited to scenarios in which the amount of data to transfer is large and the amount of data accessed is small.
The source program file is written by the user. Because the user's choices are subjective and making them well demands considerable experience, executing the program according to the data transmission mode defined for the operation data in the source program file cannot effectively guarantee execution efficiency.
The inventor further found in research that, because OpenCL programs are highly portable, they can be executed on different heterogeneous systems; however, a program that executes efficiently on one heterogeneous system does not necessarily execute efficiently on another.
Therefore, in the course of implementing the present invention, the inventor changed approach. Because an OpenCL program must be compiled before it executes, turning the source program file into binary code that the computer can recognize, the embodiments of the present application improve the compilation process that the compiler performs on the source program file.
After the compiler obtains the source program file, it analyzes the file to determine the operation data defined in it and the first data transmission mode of that operation data.
The first data transmission mode may be copy mode or mapping mode, and can be identified from the functions used in the source program file. For example, when the operation function corresponding to the operation data is clWriteBuffer, the data transmission mode of the operation data is copy mode; when the corresponding operation function is clEnqueueMapBuffer, the data transmission mode of the operation data is mapping mode.
For example, the source program file of an OpenCL program includes:
Double h_A[65536]=……;
clWriteBuffer(d_A,65536*8,h_A[0],……);
From this, the defined operation data is A, its data type is Double (double-precision floating point), and its data transmission mode, as indicated by the function clWriteBuffer, is copy mode.
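The identification step above can be sketched as a simple scan of the source text. This is only an illustration under stated assumptions: the two function names are the ones quoted in this description, while the lookup table, the regular expression, the helper name `first_transfer_modes`, and the assumption that the first argument names the buffer (as in the fragment above) are all hypothetical. A real compiler would work on a parsed representation rather than on raw text.

```python
import re

# Assumed mapping from transfer function to mode; only the two function
# names come from the description, the table itself is illustrative.
MODE_BY_FUNCTION = {
    "clWriteBuffer": "copy",
    "clEnqueueMapBuffer": "mapping",
}

# Captures the function name and its first argument, which is assumed
# to be the buffer name, as in the fragment quoted in the text.
CALL_PATTERN = re.compile(r"\b(clWriteBuffer|clEnqueueMapBuffer)\s*\(\s*(\w+)")


def first_transfer_modes(source):
    """Return {buffer name: mode} for each transfer call found in source."""
    modes = {}
    for function, buffer_name in CALL_PATTERN.findall(source):
        # Keep the first mode seen for a buffer.
        modes.setdefault(buffer_name, MODE_BY_FUNCTION[function])
    return modes


snippet = """
Double h_A[65536] = ...;
clWriteBuffer(d_A, 65536*8, h_A[0], ...);
"""
print(first_transfer_modes(snippet))  # {'d_A': 'copy'}
```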
The source program file may define multiple pieces of operation data; the processing described in the embodiments of the present application is performed for each piece of operation data.
102: Calculate the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode, respectively.
The second data transmission mode is a data transmission mode different from the first data transmission mode. In the embodiments of the present application, when the first data transmission mode is copy mode, the second data transmission mode is mapping mode; when the first data transmission mode is mapping mode, the second data transmission mode is copy mode.
In this embodiment, calculating the execution consumption time of the operation data in the first and second data transmission modes therefore means calculating its execution consumption time in copy mode and in mapping mode.
The execution consumption time includes the data transmission time of the operation data and the device program execution time.
In copy mode, because the data is actually transferred between host-side memory and device-side memory, the data transmission time depends on the total data volume of the operation data and on the data transmission rate.
The total data volume of the operation data can be obtained from the definition of the operation data in the OpenCL source program. The data transmission rate can be predetermined from the hardware characteristics of the execution platform of the current heterogeneous system.
Because the data must be copied between the host side and the device side in copy mode, the data transmission time includes the time to transfer the operation data from host-side memory to device-side memory and the time to transfer it from device-side memory to host-side memory; the two transfers take roughly the same time. The data transmission time can therefore be taken as twice the product of the total data volume of the operation data and the data transmission rate, where the rate is expressed as the time consumed per unit of data.
The device program execution time depends on the total volume of memory accesses to the operation data during device program execution and on the memory access rate of device-side memory.
The total volume of memory accesses to the operation data can be calculated from the number of work-items of the device program and the memory access volume per work-item; both can be obtained by analyzing the source program file.
In mapping mode, the operation data is not actually transferred between the host side and the device side; access is provided by establishing a mapping relationship instead. The data transmission time in mapping mode is determined by the time spent on the mapping relationship, which includes the time to establish the mapping and the time to remove it. The removal time is usually the same as the establishment time, so the data transmission time in mapping mode can be taken as twice the time to establish (or remove) the mapping relationship.
In mapping mode, the device program accesses the operation data in host-side memory during execution, so the device program execution time depends on the total volume of memory accesses to the operation data during device program execution and on the memory access rate of host-side memory.
As before, the total volume of memory accesses to the operation data can be calculated from the number of work-items of the device program and the memory access volume per work-item.
The rate at which the device program accesses device-side memory and the rate at which it accesses host-side memory can both be predetermined from the hardware characteristics of the execution platform of the current heterogeneous system.
The execution consumption time of the operation data in copy mode and in mapping mode can thus each be calculated.
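The two estimates described above can be sketched as a small cost model. This is a minimal illustration, not the compiler's actual implementation: the function and parameter names are hypothetical, and every rate is assumed to be expressed as cycles consumed per byte.

```python
def total_accessed_bytes(work_items, bytes_per_work_item):
    """Total memory access volume: work-item count times per-item volume."""
    return work_items * bytes_per_work_item


def copy_mode_cost(total_bytes, transfer_cycles_per_byte,
                   accessed_bytes, device_access_cycles_per_byte):
    """Estimated execution consumption time (cycles) in copy mode."""
    # Data is copied host -> device and device -> host, so count it twice.
    transfer_time = 2 * total_bytes * transfer_cycles_per_byte
    # The device program reads the copy held in device-side memory.
    execution_time = accessed_bytes * device_access_cycles_per_byte
    return transfer_time + execution_time


def mapping_mode_cost(mapping_cycles, accessed_bytes,
                      host_access_cycles_per_byte):
    """Estimated execution consumption time (cycles) in mapping mode."""
    # Only the mapping is set up and torn down; both are assumed equal.
    transfer_time = 2 * mapping_cycles
    # The device program reads the data where it lives, in host-side memory.
    execution_time = accessed_bytes * host_access_cycles_per_byte
    return transfer_time + execution_time
```

Keeping the platform-dependent rates as parameters reflects the statement that they are predetermined from the hardware characteristics of the execution platform.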
103: Select the data transmission mode with the smaller execution consumption time as the compiled data transmission mode of the operation data when the OpenCL source program is compiled.
104: Generate a compiled execution code file according to the compiled data transmission mode.
Once the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode has been calculated, the mode with the smaller execution consumption time can be selected as the compiled data transmission mode of the operation data for compiling the source program file. During compilation, the corresponding compiled execution code file is generated according to that compiled data transmission mode, so that when the OpenCL program executes, the operation data is transferred and processed in the selected mode. This shortens execution consumption time and improves execution efficiency.
In this embodiment, the source program file is obtained and the first data transmission mode of the operation data defined in it is determined. The execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode is then calculated, and the mode with the smaller execution consumption time is selected as the compiled data transmission mode of the operation data when the source program file is compiled. The compiled execution code file is generated accordingly, so that when the program runs on the machine, the operation data is processed in the selected compiled data transmission mode, which shortens execution consumption time and effectively improves execution efficiency. When the program is ported to another heterogeneous system, the technical solution of the embodiments of the present application can determine the data transmission mode of the operation data that suits that system, thereby effectively ensuring the execution efficiency of the program on different heterogeneous systems.
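The selection in steps 103 and 104 amounts to comparing two estimates. The sketch below assumes the estimates are already available as a dictionary keyed by mode name; the function name `choose_compiled_mode` and the tie rule (keep the mode written in the source when both estimates are equal) are assumptions, since the description does not say what happens on a tie.

```python
def choose_compiled_mode(first_mode, costs):
    """Return the mode with the smaller estimated execution consumption time.

    costs maps "copy" and "mapping" to estimated cycles.
    On a tie the mode from the source file is kept (assumed rule).
    """
    second_mode = "mapping" if first_mode == "copy" else "copy"
    if costs[second_mode] < costs[first_mode]:
        return second_mode
    return first_mode


print(choose_compiled_mode("copy", {"copy": 450, "mapping": 220}))  # mapping
```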
FIG. 2 is a flowchart of another embodiment of an OpenCL program compilation method according to the present application. The method may include the following steps:
201: Obtain the source program file of an OpenCL program, and determine the first data transmission mode of the operation data defined in the source program file.
202: Verify whether the operation data is safe when processed in the second data transmission mode; if yes, perform step 203; if no, end the procedure.
The second data transmission mode differs from the first data transmission mode. For example, when the first data transmission mode is copy mode, the second data transmission mode is mapping mode; when the first data transmission mode is mapping mode, the second data transmission mode is copy mode.
In this embodiment, when the source program file is compiled, it must be determined whether the operation data is safe if it is processed in the second data transmission mode, which differs from the first data transmission mode. If it is not safe, the procedure ends immediately.
Whether the operation data is safe can be determined by checking whether program execution would produce errors if the operation data were processed in the second data transmission mode, for example whether the operation data would remain consistent between the host side and the device side.
As one possible implementation:
When the first data transmission mode is copy mode and the second data transmission mode is mapping mode, verifying whether the operation data is safe when processed in the second data transmission mode may be:
Analyze whether the host side writes to the operation data during program execution; if not, determine that the operation data is safe when processed in the second data transmission mode.
Because the first data transmission mode of the operation data is copy mode, the host side and the device side each hold a copy of the operation data. If the host side writes to the operation data during program execution, the host-side operation data becomes inconsistent with the device-side operation data.
If the operation data were instead processed in mapping mode, it would exist only on the host side, so during program execution the operation data processed by the device side would always be consistent with the host-side operation data. This differs from how the operation data is processed in copy mode, so the operation data would be unsafe and program execution would produce errors.
When the first data transmission mode is mapping mode and the second data transmission mode is copy mode, verifying whether the operation data is safe when processed in the second data transmission mode may be:
Analyze whether the device side writes to the operation data during program execution; if not, determine that the operation data is safe when processed in the second data transmission mode.
Because the first data transmission mode of the operation data is mapping mode, the operation data exists only on the host side. If the operation data were processed in copy mode, both the host side and the device side would hold it. If the device side then writes to the operation data, the device-side data changes but the host-side data does not change at the same time, so the device-side and host-side data become inconsistent. The operation data would therefore be unsafe when processed in copy mode, and program execution would produce errors.
Therefore, the procedure of this embodiment continues only when it is determined that processing the operation data in the second data transmission mode is safe.
Whether the host side or the device side writes to the operation data can be determined by applying data flow analysis to the definitions and uses of data in the OpenCL source program.
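The safety rule above can be sketched as a check over the result of the data flow analysis. The analysis itself is not shown; its output is assumed here to be a list of `(side, operation, data name)` records, which is a hypothetical representation chosen only for illustration, as is the function name `is_switch_safe`.

```python
from typing import Iterable, Tuple

# Hypothetical record produced by data flow analysis,
# e.g. ("host", "write", "A") or ("device", "read", "B").
Access = Tuple[str, str, str]


def is_switch_safe(first_mode: str, data_name: str,
                   accesses: Iterable[Access]) -> bool:
    """Return True if the data may be processed in the other mode.

    copy -> mapping is unsafe if the host side writes the data;
    mapping -> copy is unsafe if the device side writes the data.
    """
    writer_to_check = "host" if first_mode == "copy" else "device"
    return not any(
        side == writer_to_check and operation == "write" and name == data_name
        for side, operation, name in accesses
    )


accesses = [("host", "read", "A"), ("device", "read", "A"),
            ("device", "write", "B")]
print(is_switch_safe("copy", "A", accesses))     # True: no host write to A
print(is_switch_safe("mapping", "B", accesses))  # False: device writes B
```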
203: Calculate the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode, respectively, where the first data transmission mode differs from the second data transmission mode and the execution consumption time includes the data transmission time and the device program execution time.
The first data transmission mode may be copy mode, in which case the second data transmission mode is mapping mode; or the first data transmission mode may be mapping mode, in which case the second data transmission mode is copy mode.
Calculating the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode may therefore include:
Calculating the data transmission time of the operation data in copy mode from the total data volume of the operation data and the data transmission rate.
The data transmission rate may be expressed as the time consumed to transfer one unit of data; the data transmission time can then equal twice the product of the total data volume of the operation data and the time consumed per unit of data.
Calculating the device program execution time of the operation data from the total volume of memory accesses to the operation data during device program execution and the memory access rate of device-side memory.
The memory access rate of device-side memory can be predetermined from the hardware characteristics of the execution platform of the current heterogeneous system.
The total volume of memory accesses to the operation data can equal the product of the number of work-items of the device program and the memory access volume per work-item.
A work-item is the smallest execution unit. The number of work-items indicates into how many units the computation is divided for processing. The memory access volume of each work-item can be obtained from the definitions in the OpenCL source program, specifically by means of data flow analysis.
Taking the sum of the data transmission time and the device program execution time calculated for copy mode as the execution consumption time of the operation data in copy mode.
Calculating the data transmission time of the operation data in mapping mode from the time to establish and remove the mapping relationship between the host side and the device side.
The time to establish and remove the mapping relationship can be predetermined from the hardware characteristics of the execution platform of the current heterogeneous system.
Calculating the device program execution time of the operation data from the total volume of memory accesses to the operation data during device program execution and the memory access rate of host-side memory.
The memory access rate of host-side memory can be predetermined from the hardware characteristics of the execution platform of the current heterogeneous system. The total volume of memory accesses to the operation data can equal the product of the number of work-items of the device program and the memory access volume per work-item.
Taking the sum of the data transmission time and the device program execution time calculated for mapping mode as the execution consumption time of the operation data in mapping mode.
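The calculation in step 203 can be restated compactly with the symbols used in the worked example of this description: $V_t$ is the total data volume, $S_t$ the transfer time per unit of data, $V_a$ the total memory access volume, $K_a$ the access volume per work-item, $N_{wi}$ the number of work-items, $S_{ab}$ and $S_{am}$ the access time per unit of data for device-side and host-side memory, and $t_m$ the time to establish (or remove) the mapping. The subscripts 1 and 2 denote copy mode and mapping mode. This is only a summary of the text above, not an additional requirement.

```latex
\begin{aligned}
V_a    &= K_a \, N_{wi} \\
C_{t1} &= 2 \, V_t \, S_t, & C_{a1} &= V_a \, S_{ab}, & C_1 &= C_{t1} + C_{a1} \\
C_{t2} &= 2 \, t_m,        & C_{a2} &= V_a \, S_{am}, & C_2 &= C_{t2} + C_{a2}
\end{aligned}
```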
204: Select the data transmission mode with the smaller execution consumption time as the compiled data transmission mode of the operation data when the source program file is compiled.
205: Generate a compiled execution code file according to the compiled data transmission mode.
By calculating the execution consumption time of the operation data when processed in copy mode and in mapping mode, the data transmission mode with the smaller execution consumption time is selected as the compiled data transmission mode of the operation data. The compiled data transmission mode may be the first data transmission mode of the operation data, or the second data transmission mode, which differs from the first.
Selecting the compiled data transmission mode with the smaller execution consumption time reduces execution time and improves execution efficiency when the OpenCL program runs on the machine.
In this embodiment, the source program file is obtained, the first data transmission mode of the operation data defined in it is determined, and the operation data is verified. If the operation data is safe when processed in the second data transmission mode, its execution consumption time in the first data transmission mode and in the second data transmission mode is calculated, and the mode with the smaller execution consumption time is selected as the compiled data transmission mode of the operation data at compile time. The compiled execution code file is generated accordingly, so that when the program runs on the machine, the operation data is processed in the selected compiled data transmission mode, which shortens execution consumption time and effectively improves execution efficiency. When the program is ported to another heterogeneous system, the technical solution of the embodiments of the present application can determine the data transmission mode of the operation data that suits that system, thereby ensuring the execution efficiency of the program on different heterogeneous systems.
The technical solution of the present application is described in detail below with reference to a practical application scenario. FIG. 3 is a flowchart of another embodiment of an OpenCL program compilation method according to the present application. This embodiment takes the following fragment of the source program file of an OpenCL program as an example:
Figure PCTCN2014085885-appb-000001
Figure PCTCN2014085885-appb-000002
From this source program file fragment, the operation data defined in the fragment includes operation data A and operation data B, and the data transmission mode of both is copy mode. The following description mainly takes operation data A as an example; operation data B is processed in a similar way, and the details are not repeated.
The method may include the following steps:
301: Obtain the source program file of the OpenCL program, and determine that the first data transmission mode of operation data A defined in the source program file is copy mode.
From the code "clWriteBuffer(d_A,65536*8,h_A[0],…)", the first data transmission mode of operation data A is copy mode.
302: Verify whether operation data A is safe when processed in mapping mode; if yes, perform step 303; if no, end the procedure.
Data flow analysis shows that the program contains no host-side write to operation data A, so operation data A is safe.
303: Calculate the data transmission time and the device program execution time of operation data A in copy mode to obtain the execution consumption time of operation data A in copy mode.
The data transmission time of operation data A in copy mode is Ct1=Vt*St*2.
Here Vt is the total data volume of operation data A. From the program above, the data type of operation data A is double-precision floating point, which occupies 8 bytes, and the vector length of operation data A is 65536, so the total data volume of operation data A is 65536*8 B (bytes).
St is the time consumed per unit of data, which expresses the data transmission rate; in this embodiment it is assumed to be 4 cycle/B (4 clock cycles per byte).
Because the data must be transferred twice, there and back, the data transmission time in copy mode is 65536*8*4*2=4 Mcycle.
The device program execution time of operation data A in copy mode is Ca1=Va*Sab.
Here Va is the total volume of memory accesses by the device program to operation data A, Va=Ka*Nwi, where Nwi is the number of work-items. A work-item is the smallest execution unit; from the program above, each element of the operation data can be one work-item, so Nwi is 65536. Ka is the memory access volume per work-item of the device program, that is, the memory access volume corresponding to each work-item. The program shows that the device program, when executed, accesses operation data B once and operation data A 1/4 times, so Va=1/4*65536*8 B=128 KB.
Because the device side stores operation data A in copy mode, Sab is the time consumed per unit of data when the device program accesses device-side memory, which expresses the memory access rate of device-side memory. It is assumed to be 4 cycle/B.
The device program execution time in copy mode is then 128 KB*4 cycle/B=0.5 Mcycle.
The execution consumption time in copy mode is therefore C1=Ct1+Ca1=4.5 Mcycle.
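The copy-mode arithmetic of step 303 can be reproduced directly. The sketch below uses the values given in the text; reading "M" as 2**20 and taking the per-work-item access volume Ka as one quarter of an 8-byte element (2 B) are assumptions made so that the figures match, and the function name is hypothetical.

```python
MI = 1024 * 1024  # "M" is read as 2**20 here (an assumption)


def copy_mode_example():
    """Return (Ct1, Ca1, C1) in cycles for operation data A in copy mode."""
    element_bytes = 8                    # Double occupies 8 bytes
    vector_length = 65536
    Vt = vector_length * element_bytes   # total data volume: 524288 B
    St = 4                               # transfer time, cycle/B
    Ct1 = Vt * St * 2                    # two transfers, there and back

    Nwi = 65536                          # number of work-items
    Ka = element_bytes // 4              # assumed: 1/4 element per work-item
    Va = Ka * Nwi                        # 131072 B = 128 KB
    Sab = 4                              # device-side access time, cycle/B
    Ca1 = Va * Sab
    return Ct1, Ca1, Ct1 + Ca1


Ct1, Ca1, C1 = copy_mode_example()
print(Ct1 / MI, Ca1 / MI, C1 / MI)  # 4.0 0.5 4.5
```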
304: Calculate the data transmission time and the device program execution time of operation data A in mapping mode to obtain the execution consumption time of operation data A in mapping mode.
If the data transmission mode of operation data A is changed to mapping mode, the data transmission phase only establishes and removes the mapping relationship. Assume the time to establish or to remove the mapping relationship is tm=10 Kcycle. The data transmission time in mapping mode is then Ct2=2*tm=0.02 Mcycle.
In mapping mode, the device program execution time is Ca2=Va*Sam.
From the description above, Va=128 KB.
Sam is the time consumed per unit of data when the device program accesses host-side memory, which expresses the memory access rate of host-side memory. It is assumed to be 16 cycle/B.
The device program execution time in mapping mode is therefore Ca2=128 KB*16 cycle/B=2 Mcycle.
The execution consumption time in mapping mode is then C2=Ct2+Ca2=2.02 Mcycle.
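The mapping-mode arithmetic of step 304 can be checked the same way. The values are those assumed in the text; reading "K" as 2**10 and "M" as 2**20 is an assumption of this sketch, and the function name is hypothetical.

```python
KI = 1024         # "K" read as 2**10 (an assumption)
MI = 1024 * 1024  # "M" read as 2**20 (an assumption)


def mapping_mode_example():
    """Return (Ct2, Ca2, C2) in cycles for operation data A in mapping mode."""
    tm = 10 * KI     # time to establish (or remove) the mapping
    Ct2 = 2 * tm     # establish once, remove once

    Va = 128 * KI    # total memory access volume, from the copy-mode step
    Sam = 16         # host-side access time, cycle/B
    Ca2 = Va * Sam
    return Ct2, Ca2, Ct2 + Ca2


Ct2, Ca2, C2 = mapping_mode_example()
print(Ca2 / MI)            # 2.0
print(round(C2 / MI, 2))   # 2.02
```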
305: Compare the execution consumption time of operation data A in copy mode with the execution consumption time of operation data A in mapping mode.
306: Select mapping mode, which has the smaller execution consumption time, as the compiled data transmission mode of operation data A at compile time.
307: Generate a compiled execution code file.
由步骤304的计算结果可以得知,映射模式下的执行消耗时间较小,因此映射模式即为操作数据A在编译时的编译数据模式,从而在进行编译时,将操作数据A的数据传输模式进行更改,生成映射模式对应的编译执行代码文件。It can be known from the calculation result of step 304 that the execution consumption time in the mapping mode is small, so the mapping mode is the compiled data mode of the operation data A at compile time, so that when compiling, the data transmission mode of the operation data A will be operated. Make changes and generate a compiled execution code file corresponding to the mapping mode.
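Steps 305 and 306 amount to taking the minimum of the two costs. A minimal sketch follows; the function name and the rule of keeping the source program's original mode on a tie are assumptions, since the text does not say what happens when the two costs are equal.

```python
def select_mode(costs: dict[str, float], current: str) -> str:
    """Return the mode with the smallest cost; keep `current` on a tie (assumed rule)."""
    best = min(costs.values())
    if costs[current] == best:
        return current
    return min(costs, key=costs.get)


# Execution consumption times from the worked example, in Mcycle.
costs = {"copy": 4.5, "map": 2.02}
chosen = select_mode(costs, current="copy")
print(chosen)
```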
When the OpenCL program runs on the machine, the operation data A is processed in mapping mode, which reduces the execution time for the operation data A and improves program execution efficiency.
When the OpenCL program is ported to another heterogeneous system, the solution of this embodiment can determine the data transmission mode of the operation data A that suits that system, so that program execution efficiency is maintained there.
For brevity, the foregoing method embodiments are each described as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
FIG. 4 is a schematic structural diagram of one embodiment of a compiler provided by an embodiment of the present application. The compiler may include:
A mode determining module 401, configured to obtain the source program file of an OpenCL program and determine the first data transmission mode of the operation data defined in the source program file.
The first data transmission mode may be copy mode or mapping mode, and can be identified from the functions defined in the source program file.
A calculating module 402, configured to calculate the execution consumption time of the operation data in the first data transmission mode and in a second data transmission mode.
The second data transmission mode is different from the first data transmission mode. When the first data transmission mode is copy mode, the second data transmission mode is mapping mode; when the first data transmission mode is mapping mode, the second data transmission mode is copy mode.
In this embodiment, calculating the execution consumption time of the operation data in the first and second data transmission modes therefore means calculating it in copy mode and in mapping mode.
The execution consumption time includes the data transmission time of the operation data and the device program execution time.
A mode selection module 403, configured to select the data transmission mode with the smaller execution consumption time as the compiled data transmission mode of the operation data when the OpenCL source program is compiled.
A compiling module 404, configured to generate a compiled execution code file according to the compiled data transmission mode.
Once the execution consumption time of the operation data has been calculated in both the first and the second data transmission mode, the mode with the smaller execution consumption time can be selected as the compiled data transmission mode of the operation data. During compilation, the corresponding compiled execution code file is generated according to that mode, so that when the OpenCL program executes, the operation data is transmitted and processed in the selected mode, which shortens the execution consumption time and improves execution efficiency.
In this embodiment, when the compiler obtains the source program file for compilation, it first determines the first data transmission mode of the operation data defined in the source program. It then calculates the execution consumption time of the operation data in the first and second data transmission modes and selects the mode with the smaller execution consumption time as the compiled data transmission mode of the operation data, from which the compiled execution code file is generated. When the program runs on the machine, the operation data is processed in the selected compiled data transmission mode, which shortens the execution consumption time and effectively improves execution efficiency. When the program is ported to another heterogeneous system, applying the technical solution of this embodiment determines the data transmission mode that suits that system, which preserves the program's execution efficiency across different heterogeneous systems.
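The flow through modules 401 to 404 might be sketched as below. This is a rough illustration under several assumptions. Mapping OpenCL host calls to the two modes (`clEnqueueWriteBuffer` and `clEnqueueReadBuffer` for copy mode, `clEnqueueMapBuffer` for mapping mode) is an interpretation of "the functions defined in the source program file". The text scan stands in for real per-buffer analysis by a compiler front end, and the code generation step is a placeholder string. All Python names are hypothetical.

```python
import re

# Assumed mapping from OpenCL host API calls to the two data transmission modes.
COPY_CALLS = ("clEnqueueWriteBuffer", "clEnqueueReadBuffer")
MAP_CALLS = ("clEnqueueMapBuffer",)


def determine_first_mode(host_source: str) -> str:
    """Module 401 stand-in: guess the mode from API calls in the host source."""
    if any(re.search(rf"\b{name}\s*\(", host_source) for name in MAP_CALLS):
        return "map"
    if any(re.search(rf"\b{name}\s*\(", host_source) for name in COPY_CALLS):
        return "copy"
    raise ValueError("no recognised data transfer call found")


def compile_with_best_mode(host_source: str, cost_of) -> tuple[str, str]:
    """Modules 402-404 stand-in: cost both modes, pick the cheaper one, 'compile'."""
    first = determine_first_mode(host_source)
    second = "map" if first == "copy" else "copy"
    costs = {first: cost_of(first), second: cost_of(second)}      # module 402
    chosen = first if costs[first] <= costs[second] else second   # module 403
    return chosen, f"code generated for {chosen} mode"            # module 404 placeholder


source = "clEnqueueWriteBuffer(queue, buf, CL_TRUE, 0, size, ptr, 0, NULL, NULL);"
chosen, artifact = compile_with_best_mode(source, {"copy": 4.5, "map": 2.02}.get)
print(chosen, artifact)
```

Passing the cost function in as a parameter keeps the platform-specific numbers out of the selection logic, which matches the point that the same source can be recompiled for a different heterogeneous system.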
FIG. 5 is a schematic structural diagram of another embodiment of a compiler provided by an embodiment of the present application. The compiler may include:
A mode determining module 501, configured to obtain the source program file of an OpenCL program and determine the first data transmission mode of the operation data defined in the source program file.
A verification module 501, configured to verify whether the operation data is safe when it is processed in the second data transmission mode.
In this embodiment, when the source program file is compiled, it must be determined whether the operation data is safe if it is processed in the second data transmission mode, which differs from the first data transmission mode. If it is not safe, the procedure ends directly.
Whether the operation data is safe can be determined by checking whether program execution would produce errors if the operation data were processed in the second data transmission mode, for example whether the operation data remains consistent between the host side and the device side.
As one possible implementation,
the verification module may be specifically configured to: when the first data transmission mode is copy mode and the second data transmission mode is mapping mode, analyze whether the host side performs any write operation on the operation data during program execution and, if not, determine that the operation data is safe when processed in the second data transmission mode; and when the first data transmission mode is mapping mode and the second data transmission mode is copy mode, analyze whether the device side performs any write operation on the operation data during program execution and, if not, determine that the operation data is safe when processed in the second data transmission mode.
Whether the host side or the device side writes to the operation data can be determined by applying data flow analysis to the definitions and uses of data in the source program file.
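The safety rule above can be expressed compactly. The sketch below assumes the data flow analysis has already produced a flat list of access records; the `Access` record and the function name are hypothetical, and the analysis itself is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One access found by data flow analysis (hypothetical record format)."""
    side: str     # "host" or "device"
    kind: str     # "read" or "write"
    buffer: str   # name of the operation data being accessed


def is_switch_safe(first_mode: str, accesses: list[Access], buffer: str) -> bool:
    """Copy -> map is unsafe if the host writes; map -> copy is unsafe if the device writes."""
    if first_mode not in ("copy", "map"):
        raise ValueError(f"unknown mode: {first_mode}")
    writer = "host" if first_mode == "copy" else "device"
    return not any(
        a.buffer == buffer and a.side == writer and a.kind == "write"
        for a in accesses
    )


accesses = [Access("host", "write", "A"), Access("device", "read", "A")]
print(is_switch_safe("copy", accesses, "A"))   # host writes A during execution
print(is_switch_safe("map", accesses, "A"))    # device only reads A
```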
A calculating module 502, configured to calculate, when the verification module 501 verifies that the operation data is safe, the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode.
The second data transmission mode is different from the first data transmission mode, and the execution consumption time includes the data transmission time of the operation data and the device program execution time.
The first data transmission mode may be copy mode, in which case the second data transmission mode is mapping mode; or the first data transmission mode may be mapping mode, in which case the second data transmission mode is copy mode.
Accordingly, as shown in FIG. 6, the calculating module may specifically include:
A first transmission time calculation module 601, configured to calculate the data transmission time of the operation data in copy mode according to the total data amount of the operation data and the data transmission rate.
The data transmission rate may be expressed as the time consumed to transmit one unit of data, and the data transmission time may equal twice the product of the total data amount of the operation data and the time consumed per unit of data.
A first execution time calculation module 602, configured to calculate the device program execution time of the operation data in copy mode according to the total amount of memory access data for the operation data during device program execution and the memory access rate for accessing the device side.
The total amount of memory access data for the operation data is calculated from the number of work items of the device program defined in the source program file and the amount of memory access data per work item.
The memory access rate for accessing the device side may be predetermined according to the hardware characteristics of the platform on which the current heterogeneous system executes the program.
The total amount of memory access data for the operation data may equal the product of the number of work items of the device program and the amount of memory access data per work item.
A work-item is the smallest execution unit, and the number of work items indicates how many units the computation is divided into for processing. The amount of memory access data of each work item can be learned from the definitions in the source program file, specifically through data flow analysis.
A first consumption time calculation module 603, configured to take the sum of the data transmission time and the device program execution time calculated for copy mode as the execution consumption time of the operation data in copy mode.
A second transmission time calculation module 604, configured to calculate the data transmission time of the operation data in mapping mode according to the time needed to establish and remove the mapping relationship between the host side and the device side.
The time to establish and remove the mapping relationship may be predetermined according to the hardware characteristics of the platform on which the current heterogeneous system executes the program.
A second execution time calculation module 605, configured to calculate the device program execution time of the operation data in mapping mode according to the total amount of memory access data for the operation data during device program execution and the memory access rate for accessing the host side.
The memory access rate for accessing the host side may be predetermined according to the hardware characteristics of the platform on which the current heterogeneous system executes the program. The total amount of memory access data for the operation data may equal the product of the number of work items of the device program and the amount of memory access data per work item.
A second consumption time calculation module 606, configured to take the sum of the data transmission time and the device program execution time calculated for mapping mode as the execution consumption time of the operation data in mapping mode.
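Modules 601 to 606 together form a small cost model. The following sketch shows one way to organise it; the class, field, and function names are illustrative, and the platform constants and work-item counts in the usage lines are hypothetical values chosen only to exercise the formulas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Per-platform constants, assumed to be determined ahead of time."""
    transfer_cycles_per_byte: float   # host <-> device copy cost per byte
    device_cycles_per_byte: float     # device program accessing device-side memory
    host_cycles_per_byte: float       # device program accessing host-side memory
    map_cycles: float                 # cost of establishing or removing one mapping


def access_volume(work_items: int, bytes_per_item: int) -> int:
    """Total memory access volume: work items times bytes accessed per work item."""
    return work_items * bytes_per_item


def copy_cost(p: Platform, total_bytes: int, access_bytes: int) -> float:
    transfer = 2 * total_bytes * p.transfer_cycles_per_byte   # module 601
    execute = access_bytes * p.device_cycles_per_byte         # module 602
    return transfer + execute                                 # module 603


def map_cost(p: Platform, access_bytes: int) -> float:
    transfer = 2 * p.map_cycles                               # module 604
    execute = access_bytes * p.host_cycles_per_byte           # module 605
    return transfer + execute                                 # module 606


# Hypothetical platform and kernel: 32768 work items, 4 bytes accessed each.
p = Platform(transfer_cycles_per_byte=16, device_cycles_per_byte=4,
             host_cycles_per_byte=16, map_cycles=10 * 1024)
va = access_volume(32768, 4)
print(copy_cost(p, total_bytes=va, access_bytes=va), map_cost(p, access_bytes=va))
```

Keeping the hardware constants in one `Platform` value reflects the statement that these rates are predetermined per execution platform, so retargeting only requires a different set of constants.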
A mode selection module 503, configured to select the data transmission mode with the smaller execution consumption time as the compiled data transmission mode of the operation data when the OpenCL source program is compiled.
A compiling module 504, configured to generate a compiled execution code file according to the compiled data transmission mode.
The execution consumption time of the operation data is calculated for processing in copy mode and in mapping mode, and the mode with the smaller execution consumption time is selected as the compiled data transmission mode of the operation data. The compiled data transmission mode may be the first data transmission mode of the operation data or the second data transmission mode, which differs from the first.
Selecting the compiled data transmission mode with the smaller execution consumption time reduces execution time and improves execution efficiency when the OpenCL program runs on the machine.
In this embodiment, the compiler obtains the source program file, determines the first data transmission mode of the operation data defined in it, and verifies the operation data. If the operation data is safe when processed in the second data transmission mode, the compiler calculates the execution consumption time of the operation data in the first and second data transmission modes and selects the mode with the smaller execution consumption time as the compiled data transmission mode of the operation data, from which the compiled execution code file is generated. When the program runs on the machine, the operation data is processed in the selected compiled data transmission mode, which shortens the execution consumption time and effectively improves execution efficiency. When the program is ported to another heterogeneous system, applying the technical solution of this embodiment determines the data transmission mode that suits that system, which preserves the program's execution efficiency across different heterogeneous systems.
In practice, the compiler described in the foregoing embodiments is deployed in a computing device. A computing device on which the compiler is deployed can compile a source program file into machine-recognizable code and can select, for the operation data defined in the source program file, the data transmission mode with the smaller execution consumption time for compilation, which reduces run time and improves program execution efficiency.
From the above description, those skilled in the art can clearly understand that the present application can be implemented by software plus the necessary general-purpose hardware platform. Accordingly, referring to FIG. 7, an embodiment of the present application further provides a computing device that includes at least a memory 701 and a processor 702 connected to the memory 701 through a bus.
The memory 701 stores a set of program instructions. The memory 701 may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory.
The processor 702 is configured to invoke the program instructions stored in the memory 701 and perform the following operations:
obtaining the source program file of an OpenCL program, and determining the first data transmission mode of the operation data defined in the source program file;
calculating the execution consumption time of the operation data in the first data transmission mode and in a second data transmission mode, where the second data transmission mode is different from the first data transmission mode and the execution consumption time includes the data transmission time of the operation data and the device program execution time;
selecting the data transmission mode with the smaller execution consumption time as the compiled data transmission mode of the operation data when the source program file is compiled;
generating a compiled execution code file according to the compiled data transmission mode.
The processor 702 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
Optionally, the computing device may be used to execute any of the OpenCL program compilation methods shown in FIG. 1 to FIG. 2 provided by the embodiments of the present application.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be understood by referring to one another. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
Finally, it should be noted that relational terms such as "first" and "second" are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
For convenience of description, the above apparatus is described as separate units divided by function. When the present application is implemented, the functions of the units may of course be implemented in one or more pieces of software and/or hardware.
From the description of the above embodiments, those skilled in the art can clearly understand that the present application can be implemented by software plus the necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied as a software product. The computer software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present application or in certain parts of the embodiments.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. The present application is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

  1. An Open Computing Language (OpenCL) program compilation method, characterized by comprising:
    obtaining the source program file of an OpenCL program, and determining the first data transmission mode of the operation data defined in the source program file;
    calculating the execution consumption time of the operation data in the first data transmission mode and in a second data transmission mode, wherein the second data transmission mode is different from the first data transmission mode, and the execution consumption time includes the data transmission time of the operation data and the device program execution time;
    selecting the data transmission mode with the smaller execution consumption time as the compiled data transmission mode of the operation data when the source program file is compiled;
    generating a compiled execution code file according to the compiled data transmission mode.
  2. The method according to claim 1, characterized in that calculating the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode comprises:
    verifying whether the operation data is safe when it is processed in the second data transmission mode;
    when the operation data is safe, calculating the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode.
  3. The method according to claim 2, characterized in that, when the first data transmission mode is copy mode and the second data transmission mode is mapping mode, verifying whether the operation data is safe when it is processed in the second data transmission mode comprises:
    analyzing whether the host side performs any write operation on the operation data during program execution and, if not, determining that the operation data is safe when processed in the second data transmission mode;
    and when the first data transmission mode is mapping mode and the second data transmission mode is copy mode, verifying whether the operation data is safe when it is processed in the second data transmission mode comprises:
    analyzing whether the device side performs any write operation on the operation data during program execution and, if not, determining that the operation data is safe when processed in the second data transmission mode.
  4. The method according to any one of claims 1 to 3, characterized in that, when the first data transmission mode is copy mode, the second data transmission mode is mapping mode; or, when the first data transmission mode is mapping mode, the second data transmission mode is copy mode;
    and calculating the execution consumption time of the operation data in the first data transmission mode and in the second data transmission mode comprises:
    calculating the data transmission time of the operation data in copy mode according to the total data amount of the operation data and the data transmission rate;
    calculating the device program execution time of the operation data in copy mode according to the total amount of memory access data for the operation data during device program execution and the memory access rate for accessing the device side;
    taking the sum of the data transmission time and the device program execution time calculated for copy mode as the execution consumption time of the operation data in copy mode;
    calculating the data transmission time of the operation data in mapping mode according to the time needed to establish and remove the mapping relationship between the host side and the device side;
    calculating the device program execution time of the operation data in mapping mode according to the total amount of memory access data for the operation data during device program execution and the memory access rate for accessing the host side;
    taking the sum of the data transmission time and the device program execution time calculated for mapping mode as the execution consumption time of the operation data in mapping mode.
  5. The method according to claim 4, characterized in that the total amount of memory access data for the operation data is calculated from the number of work items of the device program defined in the source program file and the amount of memory access data per work item.
  6. The method according to claim 4, characterized in that the data transmission rate, the memory access rate for accessing the device side, or the memory access rate for accessing the host side is predetermined according to the hardware characteristics of the hardware platform on which the current heterogeneous system executes.
  7. A compiler, characterized by comprising:
    a mode determining module, configured to obtain the source program file of an OpenCL program and determine the first data transmission mode of the operation data defined in the source program file;
    a calculating module, configured to calculate the execution consumption time of the operation data in the first data transmission mode and in a second data transmission mode, wherein the second data transmission mode is different from the first data transmission mode, and the execution consumption time includes the data transmission time of the operation data and the device program execution time;
    a mode selection module, configured to select the data transmission mode with the smaller execution consumption time as the compiled data transmission mode of the operation data when the source program file is compiled;
    a compiling module, configured to generate a compiled execution code file according to the compiled data transmission mode.
  8. 根据权利要求7所述的编译器,其特征在于,还包括:The compiler according to claim 7, further comprising:
    验证模块,用于验证所述操作数据按照第二数据传输模式处理时,所述操作数据是否安全,若是,再触发所述计算模块。And a verification module, configured to verify whether the operation data is safe when the operation data is processed according to the second data transmission mode, and if so, trigger the calculation module.
  9. 根据权利要求8所述的编译器,其特征在于,所述验证模块具体用于当所述第一数据传输模式为复制模式,所述第二数据传输模式为映射模式,分析在程序执行过程中,是否存在主机端对所述操作数据的写操作,若否,确定所述操作数据安全;或者,当所述第一数据模式为映射模式,所述第二数据传输模式为复制模式时,分析在程序执行过程中,是否存在设备端对所述操作数据的写操作,若否,确定所述数据安全。The compiler according to claim 8, wherein the verification module is specifically configured to: when the first data transmission mode is a copy mode, and the second data transmission mode is a mapping mode, and the analyzing is performed during program execution. Whether there is a write operation of the operation data by the host side, if not, determining that the operation data is secure; or, when the first data mode is a mapping mode and the second data transmission mode is a copy mode, analysis During the execution of the program, whether there is a write operation of the operation data by the device side, and if not, the data is determined to be safe.
  10. The compiler according to any one of claims 7 to 9, wherein the first data transmission mode is a copy mode and the second data transmission mode is a mapping mode; or the first data transmission mode is a mapping mode and the second data transmission mode is a copy mode;
    the calculation module comprises:
    a first transmission time calculation module, configured to calculate the data transmission time of the operation data in the copy mode according to the total data volume of the operation data and the data transmission rate;
    a first execution time calculation module, configured to calculate the device program execution time of the operation data in the copy mode according to the total volume of memory accesses made to the operation data during execution of the device program and the rate at which device-side memory is accessed;
    a first consumption time calculation module, configured to take the sum of the data transmission time and the device program execution time calculated for the copy mode as the execution consumption time of the operation data in the copy mode;
    a second transmission time calculation module, configured to calculate the data transmission time of the operation data in the mapping mode according to the time needed to establish and to release the mapping relationship between the host side and the device side;
    a second execution time calculation module, configured to calculate the device program execution time of the operation data in the mapping mode according to the total volume of memory accesses made to the operation data during execution of the device program and the rate at which host-side memory is accessed;
    a second consumption time calculation module, configured to take the sum of the data transmission time and the device program execution time calculated for the mapping mode as the execution consumption time of the operation data in the mapping mode.
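The cost comparison described in claims 7 and 10 can be sketched in Python as follows. This is an illustration under stated assumptions, not the applicant's implementation: the `BufferProfile` fields, their units (bytes, bytes per second, seconds), and the function names are hypothetical, and resolving a tie in favor of the copy mode is an arbitrary choice that the claims do not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferProfile:
    """Hypothetical per-buffer inputs to the cost model (units are assumptions)."""
    total_bytes: float      # total data volume of the operation data
    accessed_bytes: float   # total memory-access volume during device program execution
    transfer_rate: float    # host-to-device data transmission rate, bytes/s
    device_mem_rate: float  # rate of accessing device-side memory, bytes/s
    host_mem_rate: float    # rate of accessing host-side memory, bytes/s
    map_setup_time: float   # time to establish the mapping, s
    map_release_time: float # time to release the mapping, s


def copy_mode_cost(p: BufferProfile) -> float:
    """Execution consumption time in copy mode: transfer time + execution time."""
    transfer_time = p.total_bytes / p.transfer_rate
    execution_time = p.accessed_bytes / p.device_mem_rate
    return transfer_time + execution_time


def map_mode_cost(p: BufferProfile) -> float:
    """Execution consumption time in mapping mode: map/unmap time + execution time."""
    transfer_time = p.map_setup_time + p.map_release_time
    execution_time = p.accessed_bytes / p.host_mem_rate
    return transfer_time + execution_time


def select_mode(p: BufferProfile) -> str:
    """Pick the mode with the smaller cost; ties go to copy (an assumption)."""
    return "copy" if copy_mode_cost(p) <= map_mode_cost(p) else "map"
```

Under this model, data that the device program touches many times tends to favor the copy mode, because the one-off transfer is amortized over fast device-memory accesses, whereas data touched only lightly tends to favor the mapping mode.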
PCT/CN2014/085885 2013-09-06 2014-09-04 Opencl program compilation method and compiler WO2015032331A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310404125.6 2013-09-06
CN201310404125.6A CN104424009B (en) 2013-09-06 2013-09-06 OpenCL program compiling methods and compiler

Publications (1)

Publication Number Publication Date
WO2015032331A1 (en) 2015-03-12

Family

ID=52627814

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/085885 WO2015032331A1 (en) 2013-09-06 2014-09-04 Opencl program compilation method and compiler

Country Status (2)

Country Link
CN (1) CN104424009B (en)
WO (1) WO2015032331A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017173662A1 (en) * 2016-04-08 2017-10-12 华为技术有限公司 Heterogeneous system based program processing method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6298477B1 (en) * 1998-10-30 2001-10-02 Sun Microsystems, Inc. Method and apparatus for selecting ways to compile at runtime
CN1518693A (en) * 2000-10-05 2004-08-04 皇家菲利浦电子有限公司 Retargetable compiling system and method
CN101034361A (en) * 2007-01-18 2007-09-12 浙江大学 Method for generating compiler optimized code based on instruction cost

Also Published As

Publication number Publication date
CN104424009A (en) 2015-03-18
CN104424009B (en) 2017-10-17

Similar Documents

Publication Publication Date Title
Lowe-Power et al. The gem5 simulator: Version 20.0+
Ubal et al. Multi2Sim: A simulation framework for CPU-GPU computing
Lo et al. Roofline model toolkit: A practical tool for architectural and program analysis
JP5551939B2 (en) Method, computer-readable medium, and system for generating parallel SIMD code for any target architecture
Konstantinidis et al. A quantitative roofline model for GPU kernel performance estimation using micro-benchmarks and hardware metric profiling
Gutierrez et al. Sources of error in full-system simulation
US8104030B2 (en) Mechanism to restrict parallelization of loops
US10175964B2 (en) Compiler caching for runtime routine redundancy tracking
Potop-Butucaru et al. Integrated worst-case execution time estimation of multicore applications
JP2013528884A (en) Dynamic loading of graph-based calculations
Jiang et al. WebPerf: Evaluating what-if scenarios for cloud-hosted web applications
US10318261B2 (en) Execution of complex recursive algorithms
WO2017015071A1 (en) Incremental interprocedural dataflow analysis during compilation
US20160078531A1 (en) Aggregation engine for real-time counterparty credit risk scoring
Yang Hierarchical roofline analysis: How to collect data using performance tools on intel cpus and nvidia gpus
Pathak et al. Enabling automatic offloading of resource-intensive smartphone applications
Bertoni et al. Performance portability evaluation of opencl benchmarks across intel and nvidia platforms
Owenson et al. An unstructured CFD mini‐application for the performance prediction of a production CFD code
Qiu et al. Clara: Performance clarity for SmartNIC offloading
Liu et al. Mousse: a system for selective symbolic execution of programs with untamed environments
WO2015032331A1 (en) Opencl program compilation method and compiler
US20160110170A1 (en) Message inlining
DeRose et al. Relative debugging for a highly parallel hybrid computer system
US20140372996A1 (en) Compiler optimization for memoization of pure function arguments
Papadopoulos et al. Customization methodology for implementation of streaming aggregation in embedded systems

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 14842096; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 14842096; Country of ref document: EP; Kind code of ref document: A1)