CN111580848A - WRF mode-based GPU migration method, device, equipment and storage medium - Google Patents

WRF mode-based GPU migration method, device, equipment and storage medium

Info

Publication number: CN111580848A
Application number: CN202010564168.0A
Authority: CN (China)
Prior art keywords: version, gpu, execution, cpu, input parameters
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Inventors: 周康明, 李震坤
Current Assignee: Shanghai Eye Control Technology Co Ltd
Original Assignee: Shanghai Eye Control Technology Co Ltd
Application filed by: Shanghai Eye Control Technology Co Ltd
Priority date / Filing date: 2020-06-19
Publication date: 2020-08-25 (publication of CN111580848A)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00: Arrangements for software engineering
    • G06F 8/60: Software deployment
    • G06F 8/65: Updates
    • G06F 8/70: Software maintenance or management
    • G06F 8/71: Version control; Configuration management

Abstract

The application relates to a WRF mode-based GPU migration method, device, equipment and storage medium. The method comprises the following steps: acquiring input parameters required by a target function, the target function being a function in the WRF mode execution program and being used for implementing the meteorological data processing process of the WRF mode; executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version; and comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result meets a preset condition, so as to finish the GPU migration process of the WRF mode; the preset condition is used for representing the execution performance of the GPU version. In this way, the migration of the GPU version is completed efficiently by comparing the execution results of the CPU version and the GPU version, and when the GPU version subsequently executes the meteorological data processing of the WRF mode, the corresponding execution performance can be greatly improved, providing a good data basis for the calculation of the WRF mode.

Description

WRF mode-based GPU migration method, device, equipment and storage medium
Technical Field
The application relates to the technical field of meteorological science, and in particular to a WRF mode-based GPU migration method, device, equipment and storage medium.
Background
With the development of meteorological science, numerical weather prediction has become a core technology for improving forecasting capability, and numerical models are an indispensable tool for carrying it out. The WRF (Weather Research and Forecasting) mode is a numerical mode commonly used in weather forecasting; its microphysics module WSM6 (WRF Single-Moment 6-class) is highly compute-intensive, and the computation of the WSM6 module is currently mostly executed on a CPU architecture.
Nowadays, the WRF mode is developing towards higher resolution, more complex physical processes, ensemble prediction and multi-mode coupling, and its demands on computer processing capability keep increasing. Existing CPU architectures can therefore no longer meet the computational needs of the WRF mode.
Disclosure of Invention
Based on this, in order to solve the problem in the conventional technology that the existing CPU architecture cannot meet the computational requirements of the WRF mode, it is necessary to provide a WRF mode-based GPU migration method, apparatus, device and storage medium.
A GPU migration method based on a WRF mode comprises the following steps:
acquiring input parameters required by a target function; the target function is a function in the WRF mode execution program and is used for realizing the meteorological data processing process in the WRF mode;
executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version;
comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result meets a preset condition to finish the GPU migration process of the WRF mode; the preset condition is used for representing the execution performance of the GPU version.
In one embodiment, the target function includes the WSM6 function; before obtaining the input parameters required by the target function, the method further includes:
and executing the preprocessing process of the meteorological data in the WRF mode execution program to obtain input parameters required by the WSM6 function, and storing the input parameters into a storage file.
In one embodiment, obtaining the input parameters required by the target function includes:
and acquiring the input parameters required by the target function from the storage file by executing the migration main program.
In one embodiment, before executing the CPU version and the GPU version corresponding to the target function according to the input parameters, the method further includes:
adopting a preset GPU version rule to rewrite the CPU version corresponding to the target function to obtain a GPU version; the GPU version rule comprises a CUDA rule and/or an OpenACC rule, and the calling interface of the GPU version is the same as the calling interface of the CPU version.
In one embodiment, executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version includes:
executing the migration main program, transmitting the input parameters into the CPU version corresponding to the target function through a calling interface, and executing the CPU version to obtain a corresponding execution result;
and executing the migration main program, transmitting the input parameters into the GPU version corresponding to the target function through the calling interface, and executing the GPU version to obtain a corresponding execution result.
In one embodiment, the execution result includes an execution time and a weather prediction result; comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result meets a preset condition, wherein the method comprises the following steps:
comparing the execution time of the GPU version with the execution time of the CPU version to determine the acceleration multiple of the GPU version;
comparing the weather prediction result of the GPU version with the weather prediction result of the CPU version to determine the result error of the GPU version;
and when the acceleration multiple of the GPU version is greater than or equal to a preset multiple threshold value and the result error of the GPU version is less than or equal to a preset error threshold value, saving the GPU version to a storage file.
In one embodiment, the method further includes:
if the acceleration multiple of the GPU version is smaller than the multiple threshold value or the result error of the GPU version is larger than the error threshold value, updating the GPU version; and returning to the step of executing the GPU version corresponding to the target function according to the input parameters after updating.
A WRF mode-based GPU migration apparatus, the apparatus comprising:
the acquisition module is used for acquiring input parameters required by the target function; the target function is a function in the WRF mode execution program and is used for realizing the data processing process of the WRF mode;
the execution module is used for executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version;
the comparison module is used for comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result meets the preset condition so as to finish the GPU migration process of the WRF mode; the preset condition is used for representing the execution performance of the GPU version.
A computer device comprising a memory and a processor, the memory storing a computer program that when executed by the processor performs the steps of:
acquiring input parameters required by a target function; the target function is a function in the WRF mode execution program and is used for realizing the meteorological data processing process in the WRF mode;
executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version;
comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result meets a preset condition to finish the GPU migration process of the WRF mode; the preset condition is used for representing the execution performance of the GPU version.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring input parameters required by a target function; the target function is a function in the WRF mode execution program and is used for realizing the meteorological data processing process in the WRF mode;
executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version;
comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result meets a preset condition to finish the GPU migration process of the WRF mode; the preset condition is used for representing the execution performance of the GPU version.
The WRF mode-based GPU migration method, apparatus, computer device and readable storage medium described above acquire the input parameters required by a target function, the target function being a function in the WRF mode execution program and being used for implementing the meteorological data processing process of the WRF mode; execute the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version; and compare the execution result of the CPU version with the execution result of the GPU version, and store the GPU version when the comparison result meets a preset condition, so as to finish the GPU migration process of the WRF mode; the preset condition is used for representing the execution performance of the GPU version. In this way, the migration of the GPU version is completed efficiently by comparing the execution results of the CPU version and the GPU version, and when the GPU version subsequently executes the meteorological data processing of the WRF mode, the corresponding execution performance can be greatly improved, providing a good data basis for the calculation of the WRF mode.
Drawings
FIG. 1 is a diagram illustrating an internal structure of a computer device according to an embodiment;
FIG. 2 is a flow diagram illustrating a WRF mode-based GPU migration method in one embodiment;
FIG. 3 is a flowchart illustrating a WRF mode-based GPU migration method in another embodiment;
FIG. 4 is a flowchart illustrating a WRF mode-based GPU migration method in yet another embodiment;
FIG. 5 is a block diagram of a WRF mode-based GPU migration device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The WRF mode-based GPU migration method provided in the embodiments of the application can be applied to the computer device shown in fig. 1. The computer device comprises a processor and a memory connected by a system bus, wherein a computer program is stored in the memory, and the steps of the method embodiments described below can be executed when the processor executes the computer program. Optionally, the computer device may further comprise a communication interface, a display screen and an input device. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a nonvolatile storage medium, which stores an operating system and a computer program, and an internal memory. The internal memory provides an environment for the operation of the operating system and the computer program in the nonvolatile storage medium. The communication interface of the computer device is used for connecting and communicating with an external terminal through a network. Optionally, the computer device may be a personal computer (PC), a personal digital assistant, another terminal device such as a tablet computer (PAD) or a mobile phone, or a cloud or remote server; the specific form of the computer device is not limited in the embodiments of the application.
In an embodiment, as shown in fig. 2, a WRF mode-based GPU migration method is provided. This embodiment relates to the specific process of migrating the GPU version based on the execution results of the CPU version and the GPU version in the WRF mode. Taking the application of the method to the computer device in fig. 1 as an example, the method comprises the following steps:
s101, acquiring input parameters required by a target function; the target function is a function in the WRF mode execution program and is used for realizing the meteorological data processing process in the WRF mode.
The numerical forecasting mode is a method for predicting the atmospheric motion state and weather phenomena by using large-scale computers to perform numerical calculation under given initial and boundary value conditions (such as surface air pressure, surface humidity and other meteorological values observed by meteorological stations) and solving the system of fluid dynamics and thermodynamics equations that describes the weather evolution process. The WRF mode is a commonly used numerical forecasting mode; it can be seen that the computation process of the WRF mode is very complex, and the WRF mode includes a plurality of computation modules, of which the microphysics module WSM6 is a highly compute-intensive module.
Generally, the WSM6 module corresponds to the WSM6 function in the WRF mode code; the WRF mode executes the WSM6 module by calling the WSM6 function in module_microphysics_driver.f. Optionally, the target function may be the WSM6 function, or a function corresponding to another module in the WRF mode, and may be used to implement the meteorological data processing process in the WRF mode. In order to port the code of the WSM6 function to the GPU and facilitate subsequent debugging, the WSM6 function is extracted from the whole module_microphysics_driver.f program and executed separately, so the input parameters of the function need to be determined when the WSM6 function is called. Optionally, the computer device may obtain the input parameters to be passed to the WSM6 function before calling the function during execution of the module_microphysics_driver.f program.
And S102, executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version.
Specifically, in this embodiment, the original code corresponding to the WSM6 function is the CPU version. The computer device may rewrite the CPU version into a GPU version according to the relevant GPU version rules, and then pass the input parameters into the input interfaces of the CPU version and the GPU version to execute each version and obtain the execution result of the CPU version and the execution result of the GPU version.
S103, comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result meets a preset condition to finish the GPU migration process of the WRF mode; the preset condition is used for representing the execution performance of the GPU version.
Specifically, the computer device compares the obtained execution result of the CPU version with the execution result of the GPU version. Optionally, the computation times of the two versions on the meteorological data may be compared: if the computation time of the GPU version is significantly shorter than that of the CPU version, for example shorter by at least a preset multiple (e.g., 3 times), the comparison result is considered to satisfy the preset condition, which indicates that the execution performance of the rewritten GPU version is good enough to accelerate the processing of complex meteorological data and meet the needs of practical scenarios. The computer device may then store the current GPU version, completing the GPU migration process of the WSM6 function in the WRF mode. The GPU migration process for other functions is similar to that of the WSM6 function and is not described again here.
Optionally, if the comparison result does not satisfy the preset condition, the computer device may further optimize and update the GPU version, for example by reducing redundant execution steps, then re-execute the updated GPU version with the input parameters and make the judgment again until the preset condition is satisfied.
In the WRF mode-based GPU migration method provided in this embodiment, the computer device first obtains the input parameters required by a target function used for implementing the meteorological data processing process in the WRF mode, then executes the CPU version and the GPU version corresponding to the target function according to the input parameters, obtains the execution result of the CPU version and the execution result of the GPU version, and compares the two execution results; when the comparison result meets the preset condition, the GPU version is stored, which represents that the GPU migration process of the WRF mode is completed. In this way, the migration of the GPU version is completed efficiently by comparing the execution results of the CPU version and the GPU version, and when the GPU version subsequently executes the meteorological data processing of the WRF mode, the corresponding execution performance can be greatly improved, providing a good data basis for the calculation of the WRF mode.
Optionally, in one embodiment, before obtaining the input parameters required by the target function, the computer device may first execute the preprocessing of the meteorological data in the WRF mode execution program, that is, the program steps before the WSM6 function, such as noise-reduction filtering, normalization or other processing of the meteorological data, to obtain the input parameters required by the WSM6 function; it may then add the relevant code before the call to the WSM6 function so that these input parameters are saved to a storage file, such as a disk file. In this way, when the CPU version or the GPU version is executed later, the input parameters can be obtained directly from the storage file, which improves execution efficiency.
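For illustration only, the parameter-capture step can be sketched as follows. WRF and the WSM6 routine are written in Fortran, so the CUDA C++ fragment below is merely an illustrative sketch under assumed names (Wsm6Inputs, dump_wsm6_inputs, the field list and the file layout are all assumptions, not the actual WSM6 argument list): it writes the data that would be passed to the WSM6 function into a flat binary storage file so that both versions can later be replayed with identical inputs.

    #include <cstdio>
    #include <vector>

    // Hypothetical container for the inputs the WSM6 function would receive
    // (grid bounds, time step, flattened 3-D fields); the real argument list
    // of WSM6 is much longer.
    struct Wsm6Inputs {
        int bounds[6];                 // assumed grid bounds: ids, ide, jds, jde, kds, kde
        double dt;                     // model time step
        std::vector<double> t, q, p;   // assumed fields: temperature, moisture, pressure
    };

    // Write the inputs to a binary storage file just before WSM6 is called,
    // so that the CPU and GPU versions can later be replayed with identical data.
    bool dump_wsm6_inputs(const Wsm6Inputs& in, const char* path) {
        std::FILE* f = std::fopen(path, "wb");
        if (!f) return false;
        std::fwrite(in.bounds, sizeof(int), 6, f);
        std::fwrite(&in.dt, sizeof(double), 1, f);
        for (const std::vector<double>* v : {&in.t, &in.q, &in.p}) {
            std::size_t n = v->size();
            std::fwrite(&n, sizeof(std::size_t), 1, f);
            std::fwrite(v->data(), sizeof(double), n, f);
        }
        std::fclose(f);
        return true;
    }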
Optionally, in one embodiment, after the input parameters required by the target function have been saved in the storage file, the computer device may obtain them from the storage file by executing the migration main program. The migration main program has the functions of reading the storage file to obtain the input parameters and of calling and executing the CPU version and the GPU version corresponding to the target function.
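A minimal sketch of such a migration main program is given below, reusing the assumed Wsm6Inputs structure from the previous sketch; wsm6_cpu, wsm6_gpu and load_wsm6_inputs are hypothetical names standing in for the ported routines and the reader of the storage file, so this is a skeleton rather than a complete program.

    #include <cstdio>
    #include <vector>

    // Hypothetical shared calling interface: both versions take the same
    // arguments and write the weather prediction result into `out`.
    void wsm6_cpu(const double* t, const double* q, const double* p,
                  int n, double dt, double* out);
    void wsm6_gpu(const double* t, const double* q, const double* p,
                  int n, double dt, double* out);

    // Hypothetical mirror of dump_wsm6_inputs() from the previous sketch.
    bool load_wsm6_inputs(Wsm6Inputs* in, const char* path);

    // Skeleton of the migration main program: read the storage file, then drive
    // the CPU version and the GPU version with the same input parameters.
    int main() {
        Wsm6Inputs in;
        if (!load_wsm6_inputs(&in, "wsm6_inputs.bin")) {
            std::fprintf(stderr, "failed to read the storage file\n");
            return 1;
        }
        const int n = static_cast<int>(in.t.size());
        std::vector<double> out_cpu(n), out_gpu(n);

        wsm6_cpu(in.t.data(), in.q.data(), in.p.data(), n, in.dt, out_cpu.data());
        wsm6_gpu(in.t.data(), in.q.data(), in.p.data(), n, in.dt, out_gpu.data());

        // ... timing and result comparison, sketched further below
        return 0;
    }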
Optionally, in an embodiment, the computer device may further rewrite the CPU version to obtain a corresponding GPU version, and then the method further includes: adopting a preset GPU version rule to rewrite the CPU version corresponding to the target function to obtain a GPU version; the GPU version rule comprises a CUDA rule and/or an OpenACC rule, and the calling interface of the GPU version is the same as the calling interface of the CPU version.
Among them, the Compute Unified Device Architecture (CUDA) is a general-purpose parallel computing architecture introduced by NVIDIA that enables the GPU to solve complex computational problems; it includes the CUDA instruction set architecture (ISA) and the parallel computing engine inside the GPU. OpenACC is a directive-based programming specification for implementing GPU parallel computing. The computer device may rewrite the CPU version corresponding to the target function according to either of these rules to obtain the GPU version; the calling interface of the rewritten GPU version is the same as that of the CPU version, so that the CPU version and the GPU version can each be called by the migration main program, which facilitates subsequent compiling and debugging. On this basis, the computer device may optionally execute the migration main program, transmit the input parameters into the CPU version corresponding to the target function through the calling interface, and execute the CPU version to obtain a corresponding execution result; and, while executing the migration main program, transmit the input parameters into the GPU version corresponding to the target function through the calling interface, and execute the GPU version to obtain a corresponding execution result.
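The rewrite rule can be illustrated with a toy example. The loop below is a stand-in element-wise update, not the real WSM6 physics, and the function names are assumptions; the point is only that the CUDA rewrite keeps exactly the same calling interface as the CPU version, so the migration main program can call either one.

    #include <cuda_runtime.h>

    // CPU version: a stand-in element-wise update (not the real WSM6 physics).
    void toy_update_cpu(const double* t, double* out, int n, double dt) {
        for (int i = 0; i < n; ++i)
            out[i] = t[i] + dt * 0.1 * t[i];
    }

    // CUDA rewrite of the same loop: one thread per grid point.
    __global__ void toy_update_kernel(const double* t, double* out, int n, double dt) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            out[i] = t[i] + dt * 0.1 * t[i];
    }

    // GPU version: identical calling interface to the CPU version, so the
    // caller does not need to know which architecture does the work.
    void toy_update_gpu(const double* t, double* out, int n, double dt) {
        double *d_t = nullptr, *d_out = nullptr;
        cudaMalloc(&d_t, n * sizeof(double));
        cudaMalloc(&d_out, n * sizeof(double));
        cudaMemcpy(d_t, t, n * sizeof(double), cudaMemcpyHostToDevice);

        const int block = 256;
        const int grid = (n + block - 1) / block;
        toy_update_kernel<<<grid, block>>>(d_t, d_out, n, dt);

        cudaMemcpy(out, d_out, n * sizeof(double), cudaMemcpyDeviceToHost);  // also synchronizes
        cudaFree(d_t);
        cudaFree(d_out);
    }

An OpenACC rewrite would instead keep the CPU loop and annotate it with directives (for example #pragma acc parallel loop), again without changing the calling interface.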
In an embodiment, the execution result includes an execution time and a weather prediction result, and this embodiment relates to the specific process of comparing the execution result of the CPU version with the execution result of the GPU version and storing the GPU version. Optionally, as shown in fig. 3, S103 may include:
s201, comparing the execution time of the GPU version with the execution time of the CPU version, and determining the acceleration multiple of the GPU version.
Specifically, the computer device may calculate a ratio of the CPU version execution time to the GPU version execution time, and use the obtained ratio as an acceleration multiple of the GPU version. For example, if the execution time of the CPU version is 30s and the execution time of the GPU version is 10s, the acceleration multiple of the GPU version is 3 (30/10).
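As a concrete illustration (the helper names seconds and acceleration_multiple are assumptions, not part of the disclosure), the two execution times and the acceleration multiple could be computed as follows; for the GPU version, the wrapper must have synchronized with the device (for example through the final copy of results back to the host) before it returns, otherwise the measured wall-clock time would not include the kernel execution.

    #include <chrono>

    // Wall-clock timing of one call through the shared calling interface.
    template <typename F>
    double seconds(F&& run_version) {
        const auto t0 = std::chrono::steady_clock::now();
        run_version();
        const auto t1 = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(t1 - t0).count();
    }

    // Acceleration multiple of the GPU version, e.g. 30 s / 10 s = 3.
    double acceleration_multiple(double cpu_seconds, double gpu_seconds) {
        return cpu_seconds / gpu_seconds;
    }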
S202, comparing the weather prediction result of the GPU version with the weather prediction result of the CPU version, and determining the result error of the GPU version.
Specifically, the computer device may compare the weather prediction result of the GPU version and the weather prediction result of the CPU version in the same dimension, for example, compare the prediction results in the dimensions of wind speed, humidity, temperature, and the like, and use the comparison result as a result error of the GPU version. For example, if the weather prediction result of the GPU version is predicted temperature of 20 degrees, and the weather prediction result of the CPU version is predicted temperature of 25 degrees, the result error of the GPU version in the temperature dimension is 5 degrees.
S203, when the acceleration multiple of the GPU version is larger than or equal to a preset multiple threshold value and the result error of the GPU version is smaller than or equal to a preset error threshold value, saving the GPU version to a storage file.
Specifically, when the acceleration multiple of the GPU version is greater than or equal to the preset multiple threshold and the result error is less than or equal to the preset error threshold, the execution performance representing the GPU version meets the requirement, and then the computer device may save the GPU version to the storage file. For example, assuming that the preset multiple threshold is 3 times and the error threshold in the temperature dimension is 2 degrees, for the above example, the acceleration multiple of the GPU version is 3 times and satisfies the condition, and the result error is 5 degrees and does not satisfy the condition, which means that the execution performance of the rewritten GPU version does not satisfy the requirement; and if the acceleration multiple of the GPU version is 3 times and the error of the result is 2 degrees, representing that the execution performance of the rewritten GPU version meets the requirement.
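A minimal sketch of this decision logic is shown below. The maximum absolute difference over one prediction field is used here as the result error, and the thresholds are the example values from the text (an acceleration multiple of at least 3 and an error of at most 2 degrees); the metric and the helper names are illustrative assumptions rather than the disclosed implementation.

    #include <algorithm>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Result error: maximum absolute difference between the two versions'
    // prediction of the same field (e.g. temperature), an assumed metric.
    double result_error(const std::vector<double>& cpu_out,
                        const std::vector<double>& gpu_out) {
        double err = 0.0;
        const std::size_t n = std::min(cpu_out.size(), gpu_out.size());
        for (std::size_t i = 0; i < n; ++i)
            err = std::max(err, std::fabs(cpu_out[i] - gpu_out[i]));
        return err;
    }

    // The GPU version is saved only if it is both fast enough and accurate enough.
    bool gpu_version_acceptable(double cpu_seconds, double gpu_seconds,
                                double err, double multiple_threshold,
                                double error_threshold) {
        const double acceleration = cpu_seconds / gpu_seconds;
        return acceleration >= multiple_threshold && err <= error_threshold;
    }

    int main() {
        // Example values from the text: 30 s vs 10 s, a 5-degree error,
        // a 3x multiple threshold and a 2-degree error threshold.
        const bool ok = gpu_version_acceptable(30.0, 10.0, 5.0, 3.0, 2.0);
        std::puts(ok ? "save the GPU version" : "update the GPU version and retry");
        return 0;
    }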
Optionally, if the acceleration multiple of the GPU version is smaller than the multiple threshold, or the result error of the GPU version is larger than the error threshold, the computer device may perform optimization updating on the GPU version according to the GPU version rule, and return to the step of executing the GPU version corresponding to the target function according to the input parameter after updating, that is, compare the execution result of the GPU version with the execution result of the CPU version again until the comparison result meets the preset condition.
In the GPU migration method based on the WRF mode provided in this embodiment, the computer device compares the execution time and the weather prediction result of the GPU version with the execution time and the weather prediction result of the CPU version, respectively, to obtain the acceleration multiple and the result error of the GPU version, and stores the GPU version in the storage file based on the acceleration multiple and the multiple threshold, and the result error and the error threshold. Therefore, the execution performance of the GPU version is evaluated through the execution time and the prediction result, the acceleration can be realized on the basis of ensuring the accuracy, and the calculation performance of the WRF mode is greatly improved.
To better understand the whole flow of the WRF mode-based GPU migration method, the method is described again in the following by way of an overall embodiment, as shown in fig. 4, and includes:
s301, executing a preprocessing process of meteorological data in a WRF mode execution program to obtain input parameters required by a WSM6 function, and storing the input parameters into a storage file;
s302, acquiring the input parameters required by the target function from the storage file by executing the migration main program;
s303, rewriting the CPU version corresponding to the target function by adopting a preset GPU version rule to obtain a GPU version;
s304, executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version;
s305, comparing the execution result of the CPU version with the execution result of the GPU version;
s306, when the comparison result meets a preset condition, storing the GPU version to complete the GPU migration process of the WRF mode;
s307, when the comparison result does not meet the preset condition, updating the GPU version; and returning to the step of executing the GPU version corresponding to the target function according to the input parameters after updating.
For the implementation method of each step, reference may be made to the description of the above embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
It should be understood that although the steps in the flowcharts of figs. 2-4 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated otherwise herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in figs. 2-4 may include multiple sub-steps or stages that are not necessarily completed at the same time but may be performed at different times, and that are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 5, there is provided a GPU migration apparatus based on a WRF mode, including: an acquisition module 11, an execution module 12 and a comparison module 13.
Specifically, the obtaining module 11 is configured to obtain an input parameter required by the objective function; the target function is a function in the WRF mode execution program and is used for realizing the data processing process of the WRF mode;
the execution module 12 is configured to execute the CPU version and the GPU version corresponding to the target function according to the input parameter, and obtain an execution result of the CPU version and an execution result of the GPU version;
the comparison module 13 is configured to compare the execution result of the CPU version with the execution result of the GPU version, and store the GPU version when the comparison result meets a preset condition, so as to complete the GPU migration process of the WRF mode; the preset condition is used for representing the execution performance of the GPU version.
The WRF mode-based GPU migration apparatus provided in this embodiment may implement the method embodiments described above, and the implementation principle and the technical effect are similar, which are not described herein again.
In one embodiment, the objective function includes the WSM6 function; the execution module 12 is further configured to execute a preprocessing process of the meteorological data in the WRF mode execution program, obtain input parameters required by the WSM6 function, and store the input parameters in a storage file.
In an embodiment, the obtaining module 11 is specifically configured to obtain the input parameters required by the target function from the storage file by executing the migration main program.
In one embodiment, the apparatus further includes a rewriting module, configured to rewrite the CPU version corresponding to the target function by using a preset GPU version rule, to obtain a GPU version; the GPU version rule comprises a CUDA rule and/or an OpenACC rule, and the calling interface of the GPU version is the same as the calling interface of the CPU version.
In one embodiment, the execution module 12 is specifically configured to execute the migration main program, transmit the input parameters into the CPU version corresponding to the target function through the calling interface, and execute the CPU version to obtain a corresponding execution result; and to execute the migration main program, transmit the input parameters into the GPU version corresponding to the target function through the calling interface, and execute the GPU version to obtain a corresponding execution result.
In one embodiment, the execution result includes an execution time and a weather prediction result; a comparison module 13, configured to compare the execution time of the GPU version with the execution time of the CPU version, and determine an acceleration multiple of the GPU version; comparing the weather prediction result of the GPU version with the weather prediction result of the CPU version to determine the result error of the GPU version; and when the acceleration multiple of the GPU version is greater than or equal to a preset multiple threshold value and the result error of the GPU version is less than or equal to a preset error threshold value, saving the GPU version to a storage file.
In one embodiment, the rewriting module is further configured to update the GPU version if the acceleration multiple of the GPU version is smaller than a multiple threshold, or the result error of the GPU version is larger than an error threshold; and returning to the step of executing the GPU version corresponding to the target function according to the input parameters after updating.
For specific limitations of the GPU migration apparatus based on the WRF mode, reference may be made to the above limitations of the GPU migration method based on the WRF mode, and details are not described here. The various modules in the WRF mode-based GPU migration apparatus described above may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 1. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a WRF mode-based GPU migration method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 1 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring input parameters required by a target function; the target function is a function in the WRF mode execution program and is used for realizing the meteorological data processing process in the WRF mode;
executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version;
comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result meets a preset condition to finish the GPU migration process of the WRF mode; the preset condition is used for representing the execution performance of the GPU version.
The implementation principle and technical effect of the computer device provided in this embodiment are similar to those of the method embodiments described above, and are not described herein again.
In one embodiment, the objective function includes the WSM6 function; the processor, when executing the computer program, further performs the steps of:
and executing the preprocessing process of the meteorological data in the WRF mode execution program to obtain input parameters required by the WSM6 function, and storing the input parameters into a storage file.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and acquiring the input parameters required by the target function from the storage file by executing the migration main program.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
adopting a preset GPU version rule to rewrite the CPU version corresponding to the target function to obtain a GPU version; the GPU version rule comprises a CUDA rule and/or an OpenACC rule, and the calling interface of the GPU version is the same as the calling interface of the CPU version.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
executing the migration main program, transmitting the input parameters into the CPU version corresponding to the target function through a calling interface, and executing the CPU version to obtain a corresponding execution result;
and executing the migration main program, transmitting the input parameters into the GPU version corresponding to the target function through the calling interface, and executing the GPU version to obtain a corresponding execution result.
In one embodiment, the execution result includes an execution time and a weather prediction result; the processor, when executing the computer program, further performs the steps of:
comparing the execution time of the GPU version with the execution time of the CPU version to determine the acceleration multiple of the GPU version;
comparing the weather prediction result of the GPU version with the weather prediction result of the CPU version to determine the result error of the GPU version;
and when the acceleration multiple of the GPU version is greater than or equal to a preset multiple threshold value and the result error of the GPU version is less than or equal to a preset error threshold value, saving the GPU version to a storage file.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
if the acceleration multiple of the GPU version is smaller than the multiple threshold value or the result error of the GPU version is larger than the error threshold value, updating the GPU version; and returning to the step of executing the GPU version corresponding to the target function according to the input parameters after updating.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring input parameters required by a target function; the target function is a function in the WRF mode execution program and is used for realizing the meteorological data processing process in the WRF mode;
executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version;
comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result meets a preset condition to finish the GPU migration process of the WRF mode; the preset condition is used for representing the execution performance of the GPU version.
The implementation principle and technical effect of the computer-readable storage medium provided by this embodiment are similar to those of the above-described method embodiment, and are not described herein again.
In one embodiment, the objective function includes the WSM6 function; the computer program when executed by the processor further realizes the steps of:
and executing the preprocessing process of the meteorological data in the WRF mode execution program to obtain input parameters required by the WSM6 function, and storing the input parameters into a storage file.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and acquiring the input parameters required by the target function from the storage file by executing the migration main program.
In one embodiment, the computer program when executed by the processor further performs the steps of:
adopting a preset GPU version rule to rewrite the CPU version corresponding to the target function to obtain a GPU version; the GPU version rule comprises a CUDA rule and/or an OpenACC rule, and the calling interface of the GPU version is the same as the calling interface of the CPU version.
In one embodiment, the computer program when executed by the processor further performs the steps of:
executing the migration main program, transmitting the input parameters into the CPU version corresponding to the target function through a calling interface, and executing the CPU version to obtain a corresponding execution result;
and executing the migration main program, transmitting the input parameters into the GPU version corresponding to the target function through the calling interface, and executing the GPU version to obtain a corresponding execution result.
In one embodiment, the execution result includes an execution time and a weather prediction result; the computer program when executed by the processor further realizes the steps of:
comparing the execution time of the GPU version with the execution time of the CPU version to determine the acceleration multiple of the GPU version;
comparing the weather prediction result of the GPU version with the weather prediction result of the CPU version to determine the result error of the GPU version;
and when the acceleration multiple of the GPU version is greater than or equal to a preset multiple threshold value and the result error of the GPU version is less than or equal to a preset error threshold value, saving the GPU version to a storage file.
In one embodiment, the computer program when executed by the processor further performs the steps of:
if the acceleration multiple of the GPU version is smaller than the multiple threshold value or the result error of the GPU version is larger than the error threshold value, updating the GPU version; and returning to the step of executing the GPU version corresponding to the target function according to the input parameters after updating.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A GPU migration method based on a WRF mode is characterized by comprising the following steps:
acquiring input parameters required by a target function; the target function is a function in the WRF mode execution program and is used for realizing the meteorological data processing process in the WRF mode;
executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version;
comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result meets a preset condition to finish the GPU migration process of a WRF mode; the preset condition is used for representing the execution performance of the GPU version.
2. The method of claim 1, wherein the target function comprises a WSM6 function; before the obtaining the input parameters required by the target function, the method further comprises:
and executing the preprocessing process of meteorological data in the WRF mode execution program to obtain input parameters required by the WSM6 function, and storing the input parameters into a storage file.
3. The method of claim 2, wherein the obtaining input parameters required by the target function comprises:
and acquiring the input parameters required by the target function from the storage file by executing the migration main program.
4. The method of claim 1, wherein prior to the executing the CPU version and the GPU version corresponding to the target function according to the input parameters, the method further comprises:
adopting a preset GPU version rule to rewrite the CPU version corresponding to the target function to obtain the GPU version; the GPU version rule comprises a CUDA rule and/or an OpenACC rule, and the calling interface of the GPU version is the same as the calling interface of the CPU version.
5. The method according to claim 4, wherein the executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version comprises:
executing a migration main program, transmitting the input parameters into a CPU version corresponding to the target function through a calling interface, and executing the CPU version to obtain a corresponding execution result;
and executing the migration main program, transmitting the input parameters into a GPU version corresponding to the target function through the calling interface, and executing the GPU version to obtain a corresponding execution result.
6. The method of claim 1, wherein the execution results include execution time and weather prediction results; the comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result satisfies a preset condition includes:
comparing the execution time of the GPU version with the execution time of the CPU version to determine the acceleration multiple of the GPU version;
comparing the weather prediction result of the GPU version with the weather prediction result of the CPU version to determine a result error of the GPU version;
and when the acceleration multiple of the GPU version is larger than or equal to a preset multiple threshold value and the result error of the GPU version is smaller than or equal to a preset error threshold value, saving the GPU version to a storage file.
7. The method of claim 6, further comprising:
if the acceleration multiple of the GPU version is smaller than the multiple threshold value or the result error of the GPU version is larger than the error threshold value, updating the GPU version; and returning to the step of executing the GPU version corresponding to the target function according to the input parameters after updating.
8. A WRF mode-based GPU migration apparatus, the apparatus comprising:
the acquisition module is used for acquiring input parameters required by the target function; the target function is a function in a WRF mode execution program and is used for realizing a data processing process of the WRF mode;
the execution module is used for executing the CPU version and the GPU version corresponding to the target function according to the input parameters to obtain an execution result of the CPU version and an execution result of the GPU version;
the comparison module is used for comparing the execution result of the CPU version with the execution result of the GPU version, and storing the GPU version when the comparison result meets a preset condition so as to finish the GPU migration process of a WRF mode; the preset condition is used for representing the execution performance of the GPU version.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202010564168.0A 2020-06-19 (filed 2020-06-19) WRF mode-based GPU migration method, device, equipment and storage medium. Status: Pending. Publication: CN111580848A (en).

Priority Applications (1)

Application Number: CN202010564168.0A; Priority Date: 2020-06-19; Filing Date: 2020-06-19; Title: WRF mode-based GPU migration method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number: CN202010564168.0A; Priority Date: 2020-06-19; Filing Date: 2020-06-19; Title: WRF mode-based GPU migration method, device, equipment and storage medium

Publications (1)

Publication Number: CN111580848A; Publication Date: 2020-08-25

Family

ID=72123916

Family Applications (1)

Application Number: CN202010564168.0A; Title: WRF mode-based GPU migration method, device, equipment and storage medium; Status: Pending

Country Status (1)

Country: CN; Link: CN111580848A (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102799416A (en) * 2012-07-16 2012-11-28 中国人民解放军国防科学技术大学 GPU-oriented fine grit parallel application mapping method
US20190196853A1 (en) * 2017-12-21 2019-06-27 International Business Machines Corporation Runtime gpu/cpu selection
CN110852523A (en) * 2019-11-19 2020-02-28 上海眼控科技股份有限公司 Weather forecasting method, device, equipment and storage medium based on numerical mode

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MELIN HUANG et al.: "Parallel GPU architecture framework for the WRF Single Moment 6-class microphysics scheme" *
王卓薇; 许先斌; 赵武清; 何水兵; 张玉萍: "基于GPU的GRAPES模型并行加速及性能优化" (GPU-based parallel acceleration and performance optimization of the GRAPES model) *
马婉贞: "基于WRF的DBN风速预测与并行优化研究" (Research on WRF-based DBN wind speed prediction and parallel optimization) *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination