CN115115048A - Model conversion method, device, computer equipment and storage medium - Google Patents

Model conversion method, device, computer equipment and storage medium

Info

Publication number
CN115115048A
CN115115048A (application CN202210724402.0A)
Authority
CN
China
Prior art keywords
function
processing
network model
model
operator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210724402.0A
Other languages
Chinese (zh)
Inventor
杨伟光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202210724402.0A
Publication of CN115115048A
Legal status: Pending

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods
    • G06N3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer And Data Communications (AREA)

Abstract

The embodiment of the application discloses a model conversion method, a model conversion apparatus, computer equipment and a storage medium, belonging to the technical field of computers. The method comprises the following steps: analyzing a first network model belonging to a first deep learning framework to obtain operator information of the first network model, the operator information comprising at least one operator and the processing parameters corresponding to each operator; in response to a model creation instruction for the operator information, inputting the operator information into a model creation function in a function library, the model creation function being used to create models belonging to a second deep learning framework; and calling the model creation function to create a second network model based on the operator information. Because the operator information of the first network model is input into the model creation function, the model creation function automatically creates the second network model based on that information, realizing the conversion of the network model, simplifying the operation flow of the conversion, and improving the conversion efficiency of the network model.

Description

Model conversion method, device, computer equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a model conversion method, a model conversion device, computer equipment and a storage medium.
Background
With the rapid development of artificial intelligence technology, many fields need to process data by means of a network model, for example to perform speech recognition or text recognition. Deep learning frameworks come in a variety of types, such as deep learning training frameworks for training models and deep learning inference frameworks for accelerating inference. Network models built on different deep learning frameworks have different processing efficiencies, so it is often necessary to convert a network model of one deep learning framework into a network model belonging to another.
In the related art, after the training of a network model belonging to one deep learning framework is completed, a network model belonging to another deep learning framework is compiled manually from the trained model. This manual operation is complicated, so the efficiency of model conversion is low.
Disclosure of Invention
The embodiment of the application provides a model conversion method, a model conversion apparatus, computer equipment and a storage medium, which can improve the efficiency of model conversion. The technical solution is as follows:
In one aspect, a model conversion method is provided, and the method includes:
analyzing a first network model belonging to a first deep learning framework to obtain operator information of the first network model, wherein the operator information comprises at least one operator and a processing parameter corresponding to each operator;
inputting the operator information into a model creation function in a function library in response to a model creation instruction for the operator information, the model creation function being used for creating a model belonging to a second deep learning framework;
and calling the model creating function, and creating a second network model based on the operator information.
In another aspect, there is provided a model conversion apparatus, the apparatus including:
the analysis module is used for analyzing a first network model belonging to a first deep learning framework to obtain operator information of the first network model, wherein the operator information comprises at least one operator and a processing parameter corresponding to each operator;
an input module, configured to input the operator information into a model creation function in a function library in response to a model creation instruction for the operator information, where the model creation function is used to create a model belonging to a second deep learning framework;
and the creating module is used for calling the model creating function and creating a second network model based on the operator information.
Optionally, the apparatus further comprises:
the function encapsulation module is used for acquiring, in response to a function encapsulation instruction, the code carried in the instruction, where the code is used to create a model belonging to the second deep learning framework based on operator information to be input;
the function encapsulation module is further configured to encapsulate the code to obtain the model creation function.
Optionally, the model creation function includes a generation function and a target processing function, and the generation function and the target processing function belong to the second deep learning framework; the creation module includes:
the creating unit is used for calling the generating function and generating a third network model based on the operator information and the blank network model, wherein the third network model is a network model which is not subjected to acceleration processing;
and the target processing unit is used for calling the target processing function and carrying out accelerated processing on the third network model to obtain the second network model.
Optionally, the creating unit is configured to:
based on the operator information, inquiring a processing function corresponding to the at least one operator in the function library, wherein the function library comprises a plurality of processing functions belonging to the second deep learning framework, and the processing function corresponding to the operator is a processing function for realizing the same function as the operator;
and inputting the processing parameters and the processing functions corresponding to the at least one operator into the generating function, calling the generating function, and correspondingly filling the processing functions corresponding to the processing parameters into the blank network model to obtain the third network model.
Optionally, the apparatus further comprises:
the query module is used for querying, for a target operator belonging to the first deep learning framework, at least one first processing interface corresponding to the target operator in a first interface library, where the at least one first processing interface corresponding to the target operator refers to at least one first processing interface for realizing the function of the target operator, the target operator is any operator belonging to the first deep learning framework, and the first interface library includes a plurality of first processing interfaces belonging to the second deep learning framework;
the function encapsulation module is used for, in the case that the at least one first processing interface corresponding to the target operator is found, encapsulating the at least one first processing interface in response to an encapsulation instruction for the at least one first processing interface to obtain the processing function corresponding to the target operator, and storing the processing function corresponding to the target operator in the function library;
the function creation module is used for, in the case that no first processing interface corresponding to the target operator is found, determining at least one second processing interface corresponding to the target operator in a second interface library, creating, in response to a function creation instruction for the at least one second processing interface, a processing plug-in including the at least one second processing interface, encapsulating the processing plug-in as the processing function corresponding to the target operator and belonging to the second deep learning framework, and storing the processing function corresponding to the target operator in the function library, where the second interface library includes a plurality of second processing interfaces belonging to the first deep learning framework.
Optionally, the apparatus further comprises:
the auxiliary processing module is used for, in response to a check instruction for a filled first processing function, calling a check function to check the first processing function and obtain a check result, the check result indicating whether the first processing function has errors;
the auxiliary processing module is also used for, in response to an identifier setting instruction for a filled second processing function, calling an identifier setting function to set the function identifier of the second processing function to the function identifier carried in the identifier setting instruction;
the auxiliary processing module is also used for, in response to an input marking instruction for a filled third processing function, calling an input marking function to mark the third processing function as the first processing function in the third network model;
and the auxiliary processing module is also used for, in response to an output marking instruction for a filled fourth processing function, calling an output marking function to mark the fourth processing function as the last processing function in the third network model.
Optionally, the apparatus further comprises:
the test module is used for responding to a test instruction of the filled fifth processing function, acquiring first test data carried by the test instruction, and inputting the first test data and the fifth processing function into a first test function in the function library;
the test module is further configured to call the first test function, and trigger the fifth processing function to process the first test data to obtain a first processing result;
the test module is configured to determine the processing performance of the fifth processing function based on the standard processing result corresponding to the first processing result and the first test data.
Optionally, the target processing unit is further configured to call the target processing function, and serialize the second network model to obtain the serialized second network model.
Optionally, the model creation function further includes an auxiliary creation function and a parameter configuration function, and the target processing unit is configured to:
in response to a creation instruction for a control function, calling the auxiliary creation function, and creating the control function and a parameter item of the control function, wherein the control function is used for calling the target processing function;
responding to a parameter configuration instruction aiming at the control function, calling the parameter configuration function, and configuring parameter values for parameter items of the control function;
and calling the control function, and triggering and calling the target processing function to carry out accelerated processing on the third network model to obtain the second network model.
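The auxiliary-creation, parameter-configuration, and acceleration sequence described above resembles a builder-configuration pattern: create a control object with empty parameter items, configure values into it, then invoke the acceleration step with the configured control object. The sketch below illustrates only that pattern in plain Python; the names `make_config`, `configure`, and `accelerate` are invented here and are not the patent's (or any framework's) API.

```python
def make_config():
    """Auxiliary creation function: create the control object and its
    (still unset) parameter items."""
    return {"max_batch_size": None, "precision": None}

def configure(config, **values):
    """Parameter configuration function: assign parameter values to the
    control object's parameter items."""
    config.update(values)
    return config

def accelerate(third_model, config):
    """Control path: trigger the target processing function with the
    configured parameters to produce the accelerated second model."""
    return {"model": third_model, "accelerated": True, "config": dict(config)}

config = configure(make_config(), max_batch_size=8, precision="fp16")
second_model = accelerate({"layers": ["conv", "relu"]}, config)
print(second_model["accelerated"], second_model["config"]["precision"])  # True fp16
```

Separating configuration from invocation, as here, lets the same third model be accelerated repeatedly under different parameter values.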
Optionally, the apparatus further comprises:
the test module is used for responding to a test instruction aiming at the second network model, acquiring second test data carried by the test instruction, and inputting the second test data and the second network model into a second test function in the function library;
the test module is further configured to call the second test function, and trigger processing of the second test data by using the second network model to obtain a second processing result;
the test module is further configured to determine the processing performance of the second network model based on the second processing result and a standard processing result corresponding to the second test data.
In another aspect, a computer device is provided, the computer device including a processor and a memory, the memory storing at least one computer program, the at least one computer program being loaded and executed by the processor to perform the operations performed by the model conversion method according to the above aspect.
In another aspect, a computer-readable storage medium is provided, in which at least one computer program is stored, the at least one computer program being loaded and executed by a processor to perform the operations performed by the model conversion method according to the above aspect.
In another aspect, a computer program product is provided, including a computer program that is loaded and executed by a processor to perform the operations performed by the model conversion method according to the above aspect.
According to the method, the apparatus, the computer device and the storage medium provided by the embodiment of the application, the first network model belonging to the first deep learning framework is analyzed to obtain its operator information; the operator information is then input into the encapsulated model creation function, so that the model creation function automatically creates the second network model belonging to the second deep learning framework based on the operator information. The network model belonging to the first deep learning framework is thus converted into a network model belonging to the second deep learning framework, which simplifies the operation flow of converting the network model and improves the conversion efficiency.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic illustration of an implementation environment provided by an embodiment of the present application;
FIG. 2 is a flowchart of a model conversion method provided in an embodiment of the present application;
FIG. 3 is a flowchart of another model conversion method provided in an embodiment of the present application;
FIG. 4 is a flowchart of another model conversion method provided in an embodiment of the present application;
FIG. 5 is a flowchart of a function encapsulation method provided in an embodiment of the present application;
FIG. 6 is a flowchart of a function generation method provided in an embodiment of the present application;
FIG. 7 is a flowchart of another model conversion method provided in an embodiment of the present application;
FIG. 8 is a system architecture diagram of a model conversion method provided in an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a model conversion apparatus provided in an embodiment of the present application;
FIG. 10 is a schematic structural diagram of another model conversion apparatus provided in an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a terminal provided in an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a server provided in an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It will be understood that the terms "first", "second", and the like used in this application may describe various concepts, but the concepts are not limited by these terms unless otherwise specified. These terms are only used to distinguish one concept from another. For example, a first network model may be referred to as a second network model and, similarly, a second network model may be referred to as a first network model, without departing from the scope of the present application.
In this application, "at least one" means one or more; for example, at least one network model may be any integer number of network models greater than or equal to one, such as one network model, two network models, or three network models. "A plurality of" means two or more; for example, a plurality of network models may be any integer number of network models greater than or equal to two, such as two network models or three network models. "Each" refers to every one of the at least one; for example, if a plurality of network models is three network models, each network model refers to every one of the three network models.
It is understood that the embodiments of the present application involve related data such as user information. When the above embodiments are applied to specific products or technologies, user permission or consent needs to be obtained, and the collection, use, and processing of the related data need to comply with the relevant laws, regulations, and standards of the relevant countries and regions.
Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline involving a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, autonomous driving, intelligent transportation, and the like.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how a computer can simulate or realize human learning behavior so as to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
The model conversion method provided by the embodiment of the present application will be described below based on an artificial intelligence technique.
Fig. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application. Referring to fig. 1, the implementation environment includes a terminal 101 and a server 102. The terminal 101 and the server 102 may be directly or indirectly connected by wired or wireless communication.
The server 102 is configured to train a first network model belonging to a first deep learning framework, convert the first network model into a second network model belonging to a second deep learning framework, and send the second network model to the terminal 101; the terminal 101 is configured to call the second network model for processing.
In one possible implementation, the terminal 101 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, or a vehicle-mounted terminal. The server 102 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a web service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), a big data and artificial intelligence platform.
In one possible implementation manner, the terminal 101 has a target application installed thereon for invoking the network model, and the terminal 101 can invoke the network model for processing through the target application. Optionally, the target application is a target application in an operating system of the terminal 101, or a target application provided by a third party, and the like.
The model conversion method provided by the embodiment of the application can be applied to any scene needing to convert the deep learning framework of the model.
Taking the network model as a speech recognition model as an example: first, a first speech recognition model belonging to PyTorch (a deep learning framework for training models) is compiled using CUDA Extension technology. Then, with the method provided in the embodiment of the present application, the first speech recognition model is analyzed to obtain its operator information, and a model creation function belonging to TensorRT (a deep learning framework for inference acceleration) is used to create, based on the operator information, a second speech recognition model belonging to TensorRT. Because the model creation function is a function encapsulated in the function library, the network model can be created automatically by calling this function, which improves the conversion efficiency of the speech recognition model.
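The two-step flow in this scenario — parse the source model for its operator information, then hand that information to an encapsulated model creation function — can be sketched framework-agnostically. Everything below is an illustrative stand-in: a real implementation would load the model with PyTorch and build the target model with the TensorRT API, neither of which is used here, and all function names are invented for illustration.

```python
# Illustrative stand-ins only; not PyTorch or TensorRT APIs.

def parse_first_model(model):
    """Step 1: extract each operator and its processing parameters
    from the source (first-framework) model."""
    return [(layer["op"], layer.get("params", {})) for layer in model["layers"]]

def model_creation_function(operator_info):
    """Step 2: the encapsulated function that rebuilds the same
    computation as a model of the target (second) framework."""
    return {"framework": "target",
            "layers": [{"op": op, "params": params} for op, params in operator_info]}

def convert(model):
    """The whole conversion: parse, then create."""
    return model_creation_function(parse_first_model(model))

source = {"framework": "source",
          "layers": [{"op": "linear", "params": {"out": 10}},
                     {"op": "softmax"}]}
target = convert(source)
print(target["framework"], len(target["layers"]))  # target 2
```

The key point the patent relies on is that `model_creation_function` is pre-encapsulated, so no target-framework code has to be written by hand for each converted model.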
In addition, with the method provided in the embodiment of the present application, network models of any type, such as a text recognition model, an intelligent speech model, or a speech separation model, can be converted.
Fig. 2 is a flowchart of a model conversion method provided in an embodiment of the present application. The embodiment is executed by a computer device. Referring to fig. 2, the method includes:
201. The computer device analyzes the first network model belonging to the first deep learning framework to obtain operator information of the first network model.
The computer device obtains a first network model, which may be a network model for performing any type of data processing task, for example, a speech recognition model for recognizing speech data, a speech separation model for separating speech data, an intelligent speech model for intelligent question answering, a text recognition model for recognizing text data, or an image recognition model for recognizing image data.
The first network model belongs to the first deep learning framework. Optionally, the first deep learning framework is a deep learning training framework used for training the first network model; for example, the first deep learning training framework is the PyTorch deep learning training framework, and the model structure of a network model generated with the PyTorch deep learning training framework has a relatively low degree of coupling.
After obtaining the first network model, the computer device analyzes it to obtain the operator information of the first network model. The operator information includes at least one operator in the first network model and the processing parameters corresponding to each operator. An operator maps values from one function space to another to realize a certain operation on those values; thus any operation on a value may be referred to as an operator. The processing parameters corresponding to an operator include its weight parameters and attribute parameters: a weight parameter is a parameter that participates in the operator's computation, while an attribute parameter indicates how the computation is performed. For example, the attribute parameters of a convolution operator include the size of the convolution kernel, the convolution stride, and the like.
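The operator information described here can be pictured as a small data structure holding, for each operator, its type, weight parameters, and attribute parameters. The following sketch is purely illustrative: the names `OperatorInfo` and `parse_model`, and the toy list-of-dicts model format, are assumptions made for the example, not part of the patent.

```python
from dataclasses import dataclass, field

@dataclass
class OperatorInfo:
    """One operator parsed out of the first network model."""
    op_type: str                                    # e.g. "conv2d", "relu"
    weights: dict = field(default_factory=dict)     # weight parameters that take part in the computation
    attributes: dict = field(default_factory=dict)  # attribute parameters describing how to compute

def parse_model(first_model):
    """Parse a (toy) first network model into its operator information.

    Here the "model" is just a list of layer dicts; a real parser would
    walk the serialized graph of the first deep learning framework instead.
    """
    return [
        OperatorInfo(
            op_type=layer["type"],
            weights=layer.get("weights", {}),
            attributes=layer.get("attributes", {}),
        )
        for layer in first_model
    ]

# A toy two-layer model: a convolution followed by an activation.
toy_model = [
    {"type": "conv2d",
     "weights": {"kernel": [[1.0]]},
     "attributes": {"kernel_size": 1, "stride": 1}},
    {"type": "relu"},
]
operator_info = parse_model(toy_model)
print([op.op_type for op in operator_info])  # ['conv2d', 'relu']
```

Note how the activation operator has no weight parameters and no attribute parameters, while the convolution operator carries both, matching the weight/attribute split described above.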
202. The computer device inputs the operator information to a model creation function in the function library in response to a model creation instruction for the operator information.
After the computer device obtains the operator information of the first network model, the operator information can be provided to a technician. When the technician performs a model creation operation on the operator information, a model creation instruction for the operator information is triggered, and the computer device, in response to the model creation instruction, inputs the operator information into the model creation function in the function library.
The function library is a function library belonging to the second deep learning framework and includes a plurality of encapsulated functions; the model creation function in the function library is used to create network models belonging to the second deep learning framework.
The second deep learning framework can be any type of deep learning framework. Optionally, the second deep learning framework is a deep learning inference framework for accelerating inference of the network model; for example, the second deep learning inference framework is the TensorRT deep learning inference framework. TensorRT is a high-performance deep learning inference optimizer that can provide low-latency, high-throughput deployment inference for deep learning applications. TensorRT can be used for inference acceleration in very large-scale data centers, embedded platforms, or autonomous driving platforms.
203. The computer device calls the model creation function and creates a second network model based on the operator information.
The second network model belongs to the second deep learning framework. Since the operator information is the operator information of the first network model, it determines the functions the first network model can realize; and since the second network model is created based on that operator information, the second network model can realize the same functions as the first network model even though the two models belong to different deep learning frameworks.
Because the model creation function is a function encapsulated in the function library, the computer device can create the corresponding second network model based on the operator information by calling the model creation function, thereby converting the network model belonging to the first deep learning framework into a network model belonging to the second deep learning framework.
According to the method provided in the embodiment of the present application, the first network model belonging to the first deep learning framework is analyzed to obtain its operator information, and the operator information is then input into the encapsulated model creation function, so that the model creation function automatically creates the second network model belonging to the second deep learning framework based on the operator information. The network model belonging to the first deep learning framework is thus converted into a network model belonging to the second deep learning framework, which simplifies the operation flow of converting the network model and improves the conversion efficiency.
On the basis of the embodiment shown in fig. 2, the computer device may create the second network model by filling the processing functions corresponding to the operators into a blank network model; the specific process is described in detail in the embodiment shown in fig. 3 below.
Fig. 3 is a flowchart of another model conversion method provided in an embodiment of the present application. The embodiment is executed by a computer device. Referring to fig. 3, the method includes:
301. The computer device analyzes the first network model belonging to the first deep learning framework to obtain operator information of the first network model.
Step 301 is similar to step 201, and will not be described herein again.
302. The computer device, in response to a model creation instruction for the operator information, inputs the operator information into a generating function in the function library and calls the generating function to create a third network model based on the operator information and a blank network model, where the third network model is a network model that has not been subjected to acceleration processing.
After the computer device obtains the operator information of the first network model, the operator information can be provided to a technician. When the technician performs a model creation operation on the operator information, a model creation instruction for the operator information is triggered and generated, and the computer device, in response to the model creation instruction, inputs the operator information and a created blank network model into the generating function in the function library. The generating function is used for generating a network model that belongs to the second deep learning framework and has not been subjected to acceleration processing.
Since the generating function is a function already encapsulated in the function library, the computer device can automatically generate the corresponding third network model based on the operator information and the blank network model by calling the generating function, where the third network model belongs to the second deep learning framework and is a network model that has not been subjected to acceleration processing.
In a possible implementation manner, the computer device queries, based on the operator information, a processing function corresponding to the at least one operator in the function library, inputs, to the generating function, a processing parameter and a processing function corresponding to the at least one operator, calls the generating function, and correspondingly fills the processing function corresponding to the processing parameter into the blank network model to obtain the third network model.
The function library includes a plurality of processing functions belonging to the second deep learning framework, the processing functions correspond one-to-one with operators belonging to the first deep learning framework, and the processing function corresponding to a given operator is a processing function that realizes the same function as that operator. The at least one operator in the operator information is an operator in the first network model, and the computer device queries the function library for the processing function corresponding to the at least one operator. The source of the processing functions in the function library is described in the embodiment shown in fig. 6 below and is not detailed here.
In addition, the computer device also creates a blank network model, and after querying the processing function corresponding to the at least one operator, the computer device inputs the processing function and the processing parameters corresponding to the at least one operator into a generating function, and the generating function is used for filling the processing function into the blank network model. Since the generating function is a function that is already packaged in the function library, the computer device can automatically fill the processing function corresponding to the processing parameter into the blank network model by calling the generating function, so as to obtain the third network model.
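The filling mechanism described above can be illustrated with a minimal pure-Python sketch. The function library, the blank model class, and the operator names below are illustrative assumptions, not an API defined by the embodiment:

```python
# Hypothetical function library: maps operator names of the first framework to
# processing functions that realize the same behavior in the second framework.
FUNCTION_LIBRARY = {
    "scale": lambda x, params: x * params["factor"],
    "relu": lambda x, params: max(0.0, x),
}

class BlankNetworkModel:
    """A blank network model: an empty, ordered list of processing functions."""
    def __init__(self):
        self.layers = []  # (operator name, processing function, processing parameters)

def generate(operator_info, blank_model):
    """Generating function (sketch): query the function library for the processing
    function matching each operator and fill it, together with its processing
    parameters, into the blank model to obtain the third network model."""
    for op in operator_info:
        processing_fn = FUNCTION_LIBRARY[op["type"]]  # query the function library
        blank_model.layers.append((op["type"], processing_fn, op.get("params", {})))
    return blank_model  # the "third network model" (not yet accelerated)

operator_info = [{"type": "scale", "params": {"factor": 2.0}}, {"type": "relu"}]
third_model = generate(operator_info, BlankNetworkModel())
```

Running the filled model layer by layer then reproduces the behavior the operator information describes.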
In another possible implementation, the computer device may further perform at least one of the following after filling the processing functions into the blank network model.
(1) The computer device, in response to a check instruction for the filled first processing function, calls the check function to check the first processing function and obtain a check result.
The check function is a function encapsulated in the function library and is used for checking any processing function. After the computer device fills the first processing function into the blank network model, if a technician wants to check whether the first processing function has errors, the technician performs a check operation on the first processing function, which triggers a check instruction for the first processing function. The computer device, in response to the check instruction, calls the check function to automatically check the first processing function, obtaining a check result that indicates whether the first processing function has errors. If the check result indicates that the first processing function has an error, the first processing function can be readjusted to correct the error.
(2) The computer device, in response to an identifier setting instruction for the filled second processing function, calls the identifier setting function and sets the function identifier of the second processing function to the function identifier carried in the identifier setting instruction.
The identifier setting function is a function encapsulated in the function library and is used for setting the function identifier corresponding to any processing function. After the second processing function is filled into the blank network model, if a technician wants to set a function identifier for the second processing function, the technician performs an identifier setting operation on the second processing function, which triggers generation of an identifier setting instruction carrying the input function identifier.
(3) The computer device calls an input tagging function in response to the input tagging instruction for the populated third processing function, tagging the third processing function as the first processing function in the third network model.
The input marking function is a function encapsulated in the function library and is used for marking any processing function as a first processing function in a network model. After the computer device fills the third processing function into the blank network model, if the third processing function is a first processing function in the third network model, the technician may mark the third processing function by performing an input marking operation on it, which triggers generation of an input marking instruction for the third processing function. The computer device, in response to the input marking instruction, calls the input marking function to automatically mark the third processing function as a first processing function in the third network model, the first processing function also being an input node of the third network model.
It should be noted that, in this embodiment of the application, only one third processing function is taken as an example. In a case where the third network model includes a plurality of processing functions that serve as first processing functions, the computer device may call the input marking function to mark each of the plurality of processing functions as a first processing function in the third network model.
(4) The computer device calls an output tagging function in response to the output tagging instruction for the populated fourth processing function, tagging the fourth processing function as a last processing function in the third network model.
The output marking function is a function encapsulated in the function library and is used for marking any processing function as the last processing function in a network model. After the computer device fills the fourth processing function into the blank network model, if the fourth processing function is the last processing function in the third network model, the technician may mark the fourth processing function by performing an output marking operation on it, which triggers generation of an output marking instruction for the fourth processing function. The computer device, in response to the output marking instruction, calls the output marking function to automatically mark the fourth processing function as the last processing function in the third network model, the last processing function also being an output node of the third network model.
It should be noted that, in this embodiment of the application, only one fourth processing function is taken as an example. In a case where the third network model includes a plurality of processing functions that serve as last processing functions, the computer device may call the output marking function to mark each of the plurality of processing functions as a last processing function in the third network model.
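The input and output marking in items (3) and (4) can be sketched as follows. The model class and function names are illustrative assumptions that continue the blank-model picture above, not an API defined by the embodiment:

```python
class NetworkModel:
    """A network model under construction: an ordered list of filled
    processing functions plus recorded input and output nodes."""
    def __init__(self, layers):
        self.layers = list(layers)
        self.inputs = []   # processing functions marked as first (input nodes)
        self.outputs = []  # processing functions marked as last (output nodes)

def input_marking_function(model, fn_id):
    """Mark a filled processing function as a first processing function,
    i.e. an input node of the model."""
    model.inputs.append(fn_id)

def output_marking_function(model, fn_id):
    """Mark a filled processing function as a last processing function,
    i.e. an output node of the model."""
    model.outputs.append(fn_id)

m = NetworkModel(["conv1", "conv2", "concat"])
input_marking_function(m, "conv1")
input_marking_function(m, "conv2")  # a model may have several input nodes
output_marking_function(m, "concat")
```

As the note above states, the marking functions may be called once per node when the model has several first or last processing functions.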
(5) The computer equipment responds to the test instruction of the filled fifth processing function, acquires first test data carried by the test instruction, and inputs the first test data and the fifth processing function into a first test function in the function library; calling the first test function, and triggering and adopting the fifth processing function to process the first test data to obtain a first processing result; and determining the processing performance of the fifth processing function based on the first processing result and the standard processing result corresponding to the first test data.
The first test function is a function encapsulated in the function library and is used for testing any processing function. After the fifth processing function is filled into the blank network model, if a technician wants to test the processing performance of the fifth processing function, the technician performs a test operation on the fifth processing function, which triggers generation of a test instruction for the fifth processing function; the test instruction also carries first test data for testing. The computer device, in response to the test instruction, obtains the first test data carried by the test instruction and inputs the first test data and the fifth processing function into the first test function so as to call the first test function, which automatically triggers processing of the first test data by the fifth processing function, thereby obtaining a first processing result corresponding to the first test data.
The first test data also corresponds to a standard processing result, and the first processing result is the result obtained by processing with the fifth processing function. The higher the similarity between the first processing result and the standard processing result, the better the processing performance of the fifth processing function; the lower the similarity, the worse the processing performance. Therefore, the processing performance of the fifth processing function can be determined by comparing the first processing result with the standard processing result.
In the embodiment of the present application, after the fifth processing function is filled into the network model, the fifth processing function is tested, which is equivalent to testing a local portion of the network model first, and is also referred to as "unit test".
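A minimal sketch of such a unit test is shown below. The match criterion (elementwise agreement within a tolerance) is an illustrative choice; the embodiment only requires that the processing result be compared with the standard result:

```python
def first_test_function(processing_fn, test_data, standard_results, tol=1e-6):
    """First test function (sketch): process the test data with the given
    processing function and report the fraction of results that match the
    standard processing results within a tolerance."""
    results = [processing_fn(x) for x in test_data]
    matches = sum(abs(r - s) <= tol for r, s in zip(results, standard_results))
    return matches / len(results)  # 1.0 means the function behaves as expected

relu = lambda x: max(0.0, x)  # the processing function under unit test
score = first_test_function(relu, [-2.0, 0.5, 3.0], [0.0, 0.5, 3.0])
```

A score below 1.0 would indicate that the filled processing function needs to be readjusted before the whole model is built.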
It should be noted that the first processing function, the second processing function, the third processing function, the fourth processing function, and the fifth processing function are each any processing function filled into the third network model. Except that the fourth processing function and the fifth processing function are different, any two of the above five processing functions may be different processing functions or the same processing function.
303. The computer device, in response to a creation instruction for the control function, calls the auxiliary creation function in the function library to create the control function and the parameter items of the control function.
The control function is used for calling a target processing function in the function library, where the target processing function is used for performing acceleration processing on a network model. The control function is also called a "builder" in the network conversion process. Different network model conversion processes have different acceleration requirements for the network model, so each time a network model needs to be converted, a control function for the current network model is created to call the target processing function to accelerate the current network model.
The auxiliary creating function is a function which is packaged in the function library, and the auxiliary creating function is used for creating the control function. And the technician executes the creation operation aiming at the control function, triggers and generates a corresponding creation instruction, and the creation instruction also carries the parameter item input by the technician. And the computer equipment responds to the creating instruction, inputs the parameter items carried in the creating instruction into the auxiliary creating function so as to call the auxiliary creating function and automatically create the control function and the parameter items of the control function.
For example, the parameter items of the control function include the maximum workspace size of the network model (max_workspace_size), whether to use INT8 (integer), whether to use FP16 (half-precision floating point), and the like.
304. The computer device, in response to a parameter configuration instruction for the control function, calls the parameter configuration function in the function library to configure parameter values for the parameter items of the control function.
The parameter configuration function is a function packaged in a function library, and is used for configuring parameter values for parameter items of the control function.
After the computer device creates the control function and the parameter item of the control function, the parameter value of the parameter item of the control function needs to be configured, a technician executes the parameter configuration operation on the control function and triggers to generate a corresponding parameter configuration instruction, and the parameter configuration instruction also carries the parameter value corresponding to the parameter item of the control function. And responding to the parameter configuration instruction by the computer equipment, inputting the parameter value carried by the parameter configuration instruction into the parameter configuration function so as to call the parameter configuration function, and configuring the parameter value carried by the parameter configuration instruction as the parameter value of the parameter item of the control function.
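Steps 303 and 304 can be sketched in pure Python as follows. The class and function names are illustrative assumptions; the parameter item names mirror the examples given above:

```python
class ControlFunction:
    """The 'builder' that will later drive the target processing function."""
    def __init__(self, param_items):
        # parameter items are created first; their values are configured later
        self.params = {name: None for name in param_items}

def auxiliary_creation_function(param_items):
    """Auxiliary creation function (sketch): create the control function and
    its parameter items from the items carried in the creation instruction."""
    return ControlFunction(param_items)

def parameter_configuration_function(control_fn, values):
    """Parameter configuration function (sketch): configure the parameter
    values carried in the configuration instruction onto the parameter items."""
    for name, value in values.items():
        if name not in control_fn.params:
            raise KeyError(f"unknown parameter item: {name}")
        control_fn.params[name] = value
    return control_fn

builder = auxiliary_creation_function(["max_workspace_size", "use_int8", "use_fp16"])
parameter_configuration_function(builder, {"max_workspace_size": 1 << 30,
                                           "use_fp16": True})
```

Separating creation (step 303) from configuration (step 304) matches the two distinct instructions the technician triggers.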
305. The computer device calls the control function, which triggers the target processing function in the function library to perform acceleration processing on the third network model to obtain the second network model.
The target processing function is a function encapsulated in the function library and is used for performing acceleration processing on any network model. After the computer device creates the third network model and the control function, the third network model is input into the control function so as to call the control function, which triggers the target processing function in the function library to automatically accelerate the third network model, thereby obtaining the second network model. The second network model belongs to the second deep learning framework and can realize the same functions as the first network model, so the first network model belonging to the first deep learning framework is automatically converted into the second network model belonging to the second deep learning framework.
In a possible implementation, the first deep learning framework is a deep learning training framework and the second deep learning framework is a deep learning inference framework used for performing inference acceleration on network models; therefore, after obtaining the third network model, which has not been subjected to acceleration processing, the computer device performs acceleration processing on it to obtain the second network model.
Optionally, the acceleration processing includes at least one of operator fusion, network layer elimination, tensor fusion, weight quantization, video memory optimization, or precision optimization. Operator fusion refers to combining a plurality of operators into one operator, network layer elimination refers to eliminating a designated network layer, tensor fusion refers to fusing tensors in a network model, weight quantification refers to quantifying weights of the operators in the network model, video memory optimization refers to multiplexing a video memory pool, and precision optimization supports precision such as FP32 (single precision floating point number), FP16 (half precision floating point number) and INT8 (integer number).
The third network model is subjected to at least one of operator fusion, network layer elimination, tensor fusion, weight quantization, video memory optimization or precision optimization, so that reasoning acceleration of the third network model is realized, a second network model subjected to accelerated processing is obtained, and the processing efficiency of the second network model is improved.
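Of the acceleration steps listed above, operator fusion is the easiest to illustrate: adjacent operators that can run as one unit are merged into a single fused operator. The fusion table and operator names below are illustrative assumptions:

```python
# Assumed fusion table: which adjacent operator pairs may be merged into one.
FUSIBLE = {("conv", "relu"): "conv_relu"}

def fuse_operators(layers):
    """Operator fusion (sketch): scan the operator sequence and merge
    fusible adjacent pairs, reducing the number of operators executed."""
    fused, i = [], 0
    while i < len(layers):
        pair = tuple(layers[i:i + 2])
        if pair in FUSIBLE:
            fused.append(FUSIBLE[pair])  # two operators become one
            i += 2
        else:
            fused.append(layers[i])
            i += 1
    return fused

accelerated = fuse_operators(["conv", "relu", "pool", "conv", "relu"])
```

Five operators become three, which is the kind of reduction that improves the processing efficiency of the accelerated second network model.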
306. The computer device calls the target processing function and serializes the second network model to obtain the serialized second network model.
The target processing function is also used for serializing any network model. The serialization of the network model refers to the conversion of the network model into a form capable of being stored or transmitted, and the network model can be stored in a hard disk from a memory by serializing the network model.
After the computer equipment acquires the second network model, the second network model is input into the target processing function so as to call the target processing function and automatically serialize the second network model, so that the serialized second network model is obtained.
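As a sketch of this serialization step, the standard-library pickle module stands in for the target processing function's own serialization; real inference frameworks use their own engine formats, so this is only an illustration of the memory-to-storable-bytes conversion:

```python
import pickle

def serialize_model(model):
    """Serialize: convert the in-memory model into a byte stream that can
    be stored on disk or transmitted."""
    return pickle.dumps(model)

def deserialize_model(blob):
    """Restore the model from its serialized form."""
    return pickle.loads(blob)

second_model = {"layers": ["conv_relu", "pool"], "params": {"use_fp16": True}}
blob = serialize_model(second_model)
restored = deserialize_model(blob)
```

The byte stream can be written to a hard disk and loaded back later, which is exactly why serialization allows the model to move from memory to persistent storage.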
It should be noted that the embodiment of the present application only takes as an example the case where the second network model is serialized after being obtained. In another embodiment, step 306 may not be performed, that is, the computer device does not serialize the second network model after obtaining it.
It should be noted that the above generating function, target processing function, auxiliary creation function, and the like together constitute the model creation function for creating a network model belonging to the second deep learning framework. Therefore, by performing the above steps 302 to 306, the following is realized: in response to a model creation instruction for the operator information, the operator information is input into the model creation function in the function library, the model creation function is called, and the second network model is created based on the operator information.
307. The computer device tests the second network model to determine the processing performance of the second network model.
The computer equipment responds to the test instruction aiming at the second network model, obtains second test data carried by the test instruction, inputs the second test data and the second network model into a second test function in the function library, calls the second test function, and triggers the second network model to process the second test data to obtain a second processing result. The computer device determines the processing performance of the second network model based on the second processing result and a standard processing result corresponding to the second test data.
The second test function is a function encapsulated in the function library and is used for testing any network model. After creating the second network model belonging to the second deep learning framework, the computer device may also test it. If a technician wants to test the processing performance of the second network model, the technician performs a test operation on the second network model, which triggers generation of a test instruction for the second network model; the test instruction also carries second test data for testing. The computer device, in response to the test instruction, obtains the second test data carried by the test instruction and inputs the second test data and the second network model into the second test function so as to call the second test function, which automatically triggers the second network model to process the second test data, thereby obtaining a second processing result corresponding to the second test data.
The second test data also corresponds to a standard processing result, and the second processing result is the result obtained by processing with the second network model. The higher the similarity between the second processing result and the standard processing result, the better the processing performance of the second network model; the lower the similarity, the worse the processing performance. Therefore, the processing performance of the second network model can be determined by comparing the second processing result with the standard processing result.
For example, the second network model is a speech recognition model, the second test data is speech data to be recognized, the speech data corresponds to a standard recognition result (e.g., a result obtained by manual recognition), the computer device calls a second test function, triggers and uses the speech recognition model to recognize the speech data, obtains a recognition result of the speech recognition model, and then compares the recognition result of the speech recognition model with the standard recognition result to determine the accuracy of the speech recognition model.
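The accuracy comparison in this speech-recognition example can be sketched as follows. The model is mocked with a lookup table; the utterance identifiers and labels are illustrative assumptions:

```python
def second_test_function(model_fn, test_data, standard_results):
    """Second test function (sketch): run the model on each test utterance
    and measure agreement with the standard (manually labelled) results."""
    results = [model_fn(x) for x in test_data]
    correct = sum(r == s for r, s in zip(results, standard_results))
    return correct / len(test_data)  # accuracy of the converted model

# A mock speech-recognition model: utterance id -> recognized text.
mock_model = {"utt1": "hello", "utt2": "world", "utt3": "model"}.get

accuracy = second_test_function(
    mock_model,
    ["utt1", "utt2", "utt3"],
    ["hello", "world", "mode"],  # last standard label differs on purpose
)
```

Comparing this accuracy (and the wall-clock time of the same run) against the first network model's figures is what verifies the conversion result.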
In addition, the second network model is triggered to be used for testing the second test data, and the processing efficiency of the second network model can be tested, so that whether the processing efficiency of the second network model is improved compared with that of the first network model after the first network model belonging to the first deep learning framework is converted into the second network model belonging to the second deep learning framework is determined.
In the embodiment of the application, after the second network model is created, the second network model is also subjected to offline reasoning verification to test the processing performance of the second network model, so that the conversion result of the network model is verified.
It should be noted that the embodiments of the present application only take as an example the case where the second network model is tested after being created. In another embodiment, step 307 may not be performed, that is, the second network model is not tested after it is created.
Fig. 4 is a flowchart of another model conversion method according to an embodiment of the present application. Where the first deep learning framework is the PyTorch deep learning framework and the second deep learning framework is the TensorRT deep learning framework, the first network model is a PyTorch network model 401 and the second network model is a TensorRT network model 402. The TensorRT deep learning framework includes a TRT (TensorRT) network model conversion optimization engine and a TRT network model execution engine; the conversion optimization engine is used for converting a network model into a TensorRT network model, and the execution engine is used for driving the TensorRT network model to process data. As shown in fig. 4, after acquiring the PyTorch network model 401, the computer device converts it through the TRT network model conversion optimization engine to obtain the TensorRT network model 402, and then drives the TensorRT network model 402 through the TRT network model execution engine to process data.
According to the method provided by the embodiment of the application, the first network model belongs to the first deep learning framework. The operator information of the first network model is obtained by parsing the first network model, and the operator information is then input into the encapsulated model creation function, so that the model creation function automatically creates the second network model belonging to the second deep learning framework based on the operator information. A network model belonging to the first deep learning framework is thus converted into a network model belonging to the second deep learning framework, which simplifies the operation flow of converting the network model and improves the conversion efficiency of the network model.
The computer device obtains the model creation function in the above embodiment by encapsulating the code, and the specific process is described in detail in the embodiment shown in fig. 5 below. Fig. 5 is a flowchart of a function encapsulation method provided in an embodiment of the present application, where the embodiment of the present application is executed by a computer device, and referring to fig. 5, the method includes:
501. the computer device responds to the function packaging instruction and obtains codes carried in the function packaging instruction.
If a technician wants to encapsulate the logic for creating a network model belonging to the second deep learning framework as a model creation function, the corresponding code is written on the computer device for creating a model belonging to the second deep learning framework based on the operator information to be input. The technician performs a function encapsulation operation based on the input code, which triggers generation of a corresponding function encapsulation instruction carrying the code. The computer device, in response to the function encapsulation instruction, acquires the code carried in the function encapsulation instruction.
502. And the computer equipment packages the code to obtain a model establishing function.
And after the computer equipment acquires the code carried in the function packaging instruction, packaging the code to obtain a model creating function, wherein the model creating function is used for creating the network model belonging to the second deep learning framework.
According to the method provided by the embodiment of the application, the code for creating a model belonging to the second deep learning framework is encapsulated into a model creation function, and the encapsulated model creation function is then called directly whenever a network model belonging to the second deep learning framework needs to be created. Technicians therefore do not need to manually write the creation code each time, which improves the usability of the model creation function and the efficiency of creating network models.
The computer device may obtain the processing function in the above embodiment by encapsulating the processing interface or creating a processing plug-in, which is described in detail in the embodiment shown in fig. 6 below. Fig. 6 is a flowchart of a function generation method provided in an embodiment of the present application, where the embodiment of the present application is executed by a computer device, and referring to fig. 6, the method includes:
601. and for a target operator belonging to the first deep learning frame, the computer equipment queries at least one first processing interface corresponding to the target operator in the first interface library.
The target operator is any operator belonging to the first deep learning framework, a first interface library belonging to a second deep learning framework is stored in the computer equipment, and the first interface library comprises a plurality of first processing interfaces belonging to the second deep learning framework. For the target operator, the computer device queries, in the first interface library, at least one first processing interface corresponding to the target operator, where the at least one first processing interface corresponding to the target operator is at least one first processing interface for implementing a function of the target operator.
For example, if only one first processing interface is needed to implement the function of the target operator, the target operator corresponds to one first processing interface, and if multiple first processing interfaces are needed to implement the function of the target operator, the target operator corresponds to multiple first processing interfaces.
The computer device queries at least one first processing interface corresponding to the target operator, and then performs step 602 or step 603 described below.
602. Under the condition that at least one first processing interface corresponding to the target operator is inquired, the computer equipment responds to an encapsulation instruction of the at least one first processing interface, encapsulates the at least one first processing interface to obtain a processing function corresponding to the target operator, and stores the processing function corresponding to the target operator in a function library.
If the computer device queries at least one first processing interface corresponding to the target operator in the first interface library, which indicates that at least one first processing interface capable of implementing the function of the target operator exists in the first interface library, a technician may perform a packaging operation on the at least one first processing interface to trigger a corresponding packaging instruction, and the computer device packages the at least one first processing interface in response to the packaging instruction, so as to obtain a processing function corresponding to the target operator, and store the processing function corresponding to the target operator in the function library.
Wherein, the processing function can realize the function of the target operator because the processing function is packaged with at least one first processing interface for realizing the function of the target operator. For example, if the target operator is a convolution operator, the processing function corresponding to the target operator can perform convolution processing. Since the at least one first processing interface belongs to the second deep learning framework, the processing function obtained by encapsulating the at least one first processing interface belongs to the second deep learning framework.
In the embodiment of the application, the processing function capable of realizing the function of the target operator is obtained by packaging the at least one first processing interface for realizing the function of the target operator, and the processing function can be subsequently and directly called for processing, so that the difficulty in using the at least one first processing interface is reduced.
603. If no first processing interface corresponding to the target operator is found, the computer device determines at least one second processing interface corresponding to the target operator in a second interface library, creates, in response to a function creation instruction for the at least one second processing interface, a processing plug-in including the at least one second processing interface, packages the processing plug-in as a processing function that corresponds to the target operator and belongs to the second deep learning framework, and stores the processing function corresponding to the target operator in the function library.
If the computer device does not find at least one first processing interface corresponding to the target operator in the first interface library, this indicates that the first interface library contains no first processing interface capable of implementing the function of the target operator, so a processing function belonging to the second deep learning framework cannot be obtained by directly packaging a first processing interface of that framework.
In this case, the computer device determines, in a second interface library, at least one second processing interface corresponding to the target operator, that is, at least one second processing interface for implementing the function of the target operator; the second interface library includes a plurality of second processing interfaces belonging to the first deep learning framework. Since the at least one second processing interface belongs to the first deep learning framework, it cannot be directly packaged as a processing function belonging to the second deep learning framework. Therefore, the computer device first creates a processing plug-in including the at least one second processing interface, and then packages the processing plug-in into a processing function belonging to the second deep learning framework, thereby obtaining the processing function corresponding to the target operator.
In the related art, if the first interface library contains no first processing interface capable of implementing the function of the target operator, a processing plug-in for implementing the function of the target operator is written manually and then packaged as a processing function belonging to the second deep learning framework; however, such a plug-in is difficult and inefficient to write. In the embodiment of the application, the processing plug-in is created by means of the second processing interface belonging to the first deep learning framework, which reduces the difficulty of creating the processing plug-in.
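The lookup-then-fallback logic of steps 602-603 can be sketched as below. All names here (`FIRST_INTERFACE_LIBRARY`, `source_framework_gelu`, `create_plugin`) are hypothetical illustrations, not real library calls: when the target framework has no interface for an operator, a source-framework interface is wrapped in a plug-in, and the plug-in is then packaged as the processing function.

```python
# Hypothetical sketch of the fallback path for unsupported operators.

# first interface library: interfaces of the second (target) framework
FIRST_INTERFACE_LIBRARY = {"relu": lambda data, p: [max(x, 0.0) for x in data]}

def source_framework_gelu(data, params):
    # stands in for a second processing interface of the first (source)
    # framework, e.g. a libtorch interface; placeholder computation only
    return [x * 0.5 for x in data]

def create_plugin(interfaces):
    """Create a processing plug-in that chains second processing interfaces."""
    def plugin(data, params):
        for interface in interfaces:
            data = interface(data, params)
        return data
    return plugin

def build_processing_function(operator_name, function_library):
    if operator_name in FIRST_INTERFACE_LIBRARY:
        # step 602: package the first processing interface directly
        function_library[operator_name] = FIRST_INTERFACE_LIBRARY[operator_name]
    else:
        # step 603: create a plug-in from second processing interfaces,
        # then package the plug-in as the processing function
        function_library[operator_name] = create_plugin([source_framework_gelu])
    return function_library[operator_name]

lib = {}
build_processing_function("relu", lib)   # direct path
build_processing_function("gelu", lib)   # plug-in fallback path
print(lib["gelu"]([2.0, -4.0], {}))  # [1.0, -2.0]
```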
Fig. 7 is a flowchart of another model conversion method provided in an embodiment of the present application. As shown in fig. 7, taking the conversion of a PyTorch network model into a TensorRT network model as an example, the model conversion method includes the following four aspects.
(I) Creating resources. The computer device creates a TensorRT builder (Builder, i.e., the above-mentioned control function), creates the parameter items (BuilderConfig) of the builder, and configures parameter values for those parameter items. The parameter items of the TensorRT builder include the maximum working memory space, whether INT8 is used, whether FP16 is used, and the like.
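The resource-creation step can be sketched as follows. The field names mirror common TensorRT builder options, but the `Builder` and `BuilderConfig` classes here are illustrative stand-ins written in plain Python, not the TensorRT API itself.

```python
# Illustrative sketch of aspect (I): create a builder, create its
# parameter items, and configure parameter values for them.
from dataclasses import dataclass

@dataclass
class BuilderConfig:
    max_workspace_size: int = 1 << 30  # maximum working memory space, bytes
    use_fp16: bool = False             # whether FP16 is used
    use_int8: bool = False             # whether INT8 is used

class Builder:
    def __init__(self):
        # parameter item of the builder
        self.config = BuilderConfig()

    def configure(self, **kwargs):
        # configure parameter values for the parameter items
        for key, value in kwargs.items():
            setattr(self.config, key, value)

builder = Builder()
builder.configure(use_fp16=True, max_workspace_size=1 << 28)
print(builder.config.use_fp16, builder.config.max_workspace_size)
```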
(II) Filling the network model. The computer device creates a blank network model using the builder. For an operator supported by TensorRT (i.e., a TensorRT processing interface corresponding to the operator exists), the corresponding TensorRT processing interface is directly encapsulated as a processing function and added to the blank network model. For an operator not supported by TensorRT (i.e., no corresponding TensorRT processing interface exists), a processing plug-in including a libtorch GPU interface (a torch interface in C++ form) is created first, and the processing plug-in is then packaged as a processing function and added to the blank network model.
(III) Optimizing the network model. After the processing functions are added to the blank network model, the resulting network model is accelerated; the acceleration processing includes operator fusion, tensor fusion, weight quantization, precision optimization, and the like. The accelerated network model is then serialized to obtain a serialized network model, which is stored on the hard disk.
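The serialize-and-store step can be sketched as below. `pickle` stands in for the engine-serialization mechanism, and the dictionary model is a hypothetical placeholder; the point is only the serialize → write-to-disk flow.

```python
# Illustrative sketch of the serialization part of aspect (III).
import os
import pickle
import tempfile

# placeholder for the accelerated network model
network_model = {"ops": ["conv", "relu"], "weights": [0.5, 1.5]}

# serialize the network model and store it on disk
path = os.path.join(tempfile.mkdtemp(), "model.bin")
with open(path, "wb") as f:
    f.write(pickle.dumps(network_model))

# reading it back yields an identical model structure
with open(path, "rb") as f:
    restored = pickle.loads(f.read())
print(restored == network_model)  # True
```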
(IV) Performing offline inference verification on the network model. After the network model is created, inference verification can be performed on it. The computer device reads the stored network model, deserializes it, processes test data with the deserialized network model to obtain a processing result, and compares the processing result against the standard processing result for verification.
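The verification flow of aspect (IV) can be sketched as follows. The model, `run_model`, and the "standard" reference results are hypothetical placeholders; what the sketch shows is the read → deserialize → process → compare sequence.

```python
# Illustrative sketch of aspect (IV): offline inference verification.
import pickle

# stands in for the serialized model stored by aspect (III)
serialized = pickle.dumps({"scale": 2.0})

def run_model(model, data):
    # the deserialized network model processes the test data
    return [x * model["scale"] for x in data]

def verify(serialized_model, test_data, standard_result, tol=1e-6):
    model = pickle.loads(serialized_model)   # deserialization
    result = run_model(model, test_data)     # processing result
    # compare the processing result with the standard processing result
    return all(abs(a - b) <= tol for a, b in zip(result, standard_result))

print(verify(serialized, [1.0, 2.0], [2.0, 4.0]))  # True
```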
Fig. 8 is a system architecture diagram of a model conversion method according to an embodiment of the present application. As shown in fig. 8, the system includes four modules for performing model conversion, namely a resource creation module, a model filling module, a model testing module, and a function testing module. For convenience of description, the four modules are described below by taking the conversion of a PyTorch network model into a TensorRT network model as an example.
(I) The resource creation module. The resource creation module includes functions obtained by encapsulating the resource creation logic and the network model optimization logic shown in fig. 7; for example, it includes an auxiliary creation function, a parameter configuration function, and a target processing function. The auxiliary creation function is used to create the TensorRT builder, the parameter items of the builder, and a blank network model. The parameter configuration function is used to configure parameter values for the parameter items of the builder. The target processing function is used to accelerate, serialize, and otherwise process the filled network model.
(II) The model filling module. The model filling module is divided into three sub-modules: a basic function sub-module, a first-class processing function sub-module, and a second-class processing function sub-module.
The basic function sub-module is used to assist in filling the network model, to set the function identifiers of the processing functions in the network model, to mark the input nodes and output nodes in the network model, and the like.
The first-class processing function sub-module includes a plurality of processing functions that correspond one-to-one to PyTorch operators and are obtained either by encapsulating TensorRT processing interfaces or by encapsulating written processing plug-ins. It should be noted that the plurality of processing functions in the first-class processing function sub-module include processing functions obtained by encapsulating TensorRT processing interfaces as well as processing functions obtained by encapsulating processing plug-ins, where each such processing plug-in is created based on a libtorch GPU interface.
The second-class processing function sub-module includes tensor-related processing functions, such as the four arithmetic operations, logarithmic functions, and matrix transformation functions.
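A few second-class processing functions can be sketched as below. The function names are hypothetical illustrations; real implementations would wrap the corresponding target-framework tensor operations.

```python
# Illustrative second-class processing functions: tensor-related helpers
# covering arithmetic, a logarithmic function, and a matrix transformation.
import math

def tensor_add(a, b):
    # elementwise addition, one of the four arithmetic operations
    return [x + y for x, y in zip(a, b)]

def tensor_log(a):
    # elementwise natural logarithm
    return [math.log(x) for x in a]

def matrix_transpose(m):
    # a simple matrix transformation function
    return [list(row) for row in zip(*m)]

print(tensor_add([1, 2], [3, 4]))          # [4, 6]
print(matrix_transpose([[1, 2], [3, 4]]))  # [[1, 3], [2, 4]]
```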
(III) The model testing module. The model testing module helps technicians simplify the offline inference verification process of a network model: by calling a function in the model testing module, the created model can be driven to process test data, and the processing performance of the model is then judged according to the processing result.
(IV) The function testing module. In the embodiment of the present application, in addition to testing the created network model, a unit test may be performed on each processing function filled into the network model during its creation, so as to ensure the processing performance of the entire network model.
Fig. 9 is a schematic structural diagram of a model transformation apparatus according to an embodiment of the present application. Referring to fig. 9, the apparatus includes:
the analysis module 901 is configured to analyze a first network model belonging to a first deep learning framework to obtain operator information of the first network model, where the operator information includes at least one operator and a processing parameter corresponding to each operator;
an input module 902, configured to input the operator information into a model creation function in the function library in response to a model creation instruction for the operator information, where the model creation function is used to create a model belonging to the second deep learning framework;
and a creating module 903, configured to invoke the model creating function, and create a second network model based on the operator information.
According to the model conversion device provided by the embodiment of the application, the first network model belongs to the first deep learning frame, the operator information of the first network model is obtained by analyzing the first network model, and then the operator information is input into the packaged model creating function, so that the model creating function automatically creates the second network model belonging to the second deep learning frame based on the operator information, the network model belonging to the first deep learning frame is converted into the network model belonging to the second deep learning frame, the operation flow for converting the network model is simplified, and the conversion efficiency of the network model is improved.
Optionally, referring to fig. 10, the apparatus further comprises:
a function encapsulation module 904, configured to, in response to a function encapsulation instruction, obtain a code carried in the function encapsulation instruction, where the code is used to create a model belonging to the second deep learning framework based on operator information to be input;
the function encapsulation module 904 is further configured to encapsulate the code, so as to obtain the model creation function.
Optionally, referring to fig. 10, the model creation function includes a generation function and an objective processing function, the generation function and the objective processing function belong to the second deep learning framework; the creation module comprises:
a creating unit 913, configured to invoke the generating function, and generate a third network model based on the operator information and the blank network model, where the third network model is a network model that has not been subjected to acceleration processing;
and the target processing unit 923 is configured to call the target processing function, and perform acceleration processing on the third network model to obtain the second network model.
Optionally, referring to fig. 10, the creating unit 913 is configured to:
based on the operator information, inquiring a processing function corresponding to the at least one operator in the function library, wherein the function library comprises a plurality of processing functions belonging to the second deep learning framework, and the processing function corresponding to the operator is a processing function for realizing the same function as the operator;
and inputting the processing parameters and the processing functions corresponding to the at least one operator into the generating function, calling the generating function, and correspondingly filling the processing functions corresponding to the processing parameters into the blank network model to obtain the third network model.
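The query-and-fill behavior of the creating unit can be sketched as follows. This is a hedged, pure-Python illustration; `FUNCTION_LIBRARY` and the list-based "blank network model" are hypothetical stand-ins for the function library and the framework's network object.

```python
# Illustrative sketch of the generation step: look up each operator's
# processing function in the function library, then fill the blank
# network model with (processing function, processing parameters) pairs.

FUNCTION_LIBRARY = {
    "conv": lambda data, p: [x * p["weight"] for x in data],
    "relu": lambda data, p: [max(x, 0.0) for x in data],
}

def generate(operator_info, blank_model):
    """Generation function: returns the third (not-yet-accelerated)
    network model built from the operator information."""
    for name, params in operator_info:
        fn = FUNCTION_LIBRARY[name]       # processing function for operator
        blank_model.append((fn, params))  # fill into the blank network model
    return blank_model

operator_info = [("conv", {"weight": -2.0}), ("relu", {})]
third_model = generate(operator_info, [])

# running the filled model end to end
data = [1.0, 2.0]
for fn, params in third_model:
    data = fn(data, params)
print(data)  # [0.0, 0.0]
```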
Optionally, referring to fig. 10, the apparatus further comprises:
the query module 905 is configured to query, in a first interface library, at least one first processing interface corresponding to a target operator belonging to the first deep learning framework, where the at least one first processing interface corresponding to the target operator refers to at least one first processing interface for implementing a function of the target operator, the target operator is any one operator belonging to the first deep learning framework, and the first interface library includes a plurality of first processing interfaces belonging to the second deep learning framework;
a function encapsulation module 904, configured to, in a case that at least one first processing interface corresponding to the target operator is queried, respond to an encapsulation instruction for the at least one first processing interface, encapsulate the at least one first processing interface to obtain a processing function corresponding to the target operator, and store the processing function corresponding to the target operator in the function library;
a function creating module 906, configured to determine, in a second interface library, at least one second processing interface corresponding to the target operator if at least one first processing interface corresponding to the target operator is not queried, create, in response to a function creating instruction for the at least one second processing interface, a processing plugin including the at least one second processing interface, package the processing plugin into a processing function corresponding to the target operator and belonging to the second deep learning framework, store the processing function corresponding to the target operator in the function library, where the second interface library includes a plurality of second processing interfaces belonging to the first deep learning framework.
Optionally, referring to fig. 10, the apparatus further comprises:
the auxiliary processing module 907 is configured to, in response to a check instruction for the filled first processing function, invoke a check function, check the first processing function to obtain a check result, where the check result is used to indicate whether the first processing function has an error;
the auxiliary processing module 907 is configured to, in response to the identifier setting instruction for the filled second processing function, invoke an identifier setting function, and determine the function identifier of the second processing function as the function identifier carried in the identifier setting instruction;
the auxiliary processing module 907 is configured to, in response to an input marking instruction for the filled third processing function, call an input marking function, and mark the third processing function as a first processing function in the third network model;
the auxiliary processing module 907 is configured to, in response to an output marking instruction for the populated fourth processing function, call an output marking function, and mark the fourth processing function as a last processing function in the third network model.
Optionally, referring to fig. 10, the apparatus further comprises:
a test module 908, configured to, in response to a test instruction for the filled fifth processing function, obtain first test data carried in the test instruction, and input the first test data and the fifth processing function to the first test function in the function library;
the test module 908 is further configured to invoke the first test function, and trigger to process the first test data by using the fifth processing function to obtain a first processing result;
the test module 908 is configured to determine the processing performance of the fifth processing function based on the first processing result and a standard processing result corresponding to the first test data.
Optionally, referring to fig. 10, the target processing unit 923 is further configured to call the target processing function, and serialize the second network model to obtain a serialized second network model.
Optionally, referring to fig. 10, the model creation function further includes an auxiliary creation function and a parameter configuration function, and the target processing unit 923 is configured to:
in response to a creation instruction for a control function, calling the auxiliary creation function, creating the control function and a parameter item of the control function, wherein the control function is used for calling the target processing function;
responding to a parameter configuration instruction aiming at the control function, calling the parameter configuration function, and configuring parameter values for parameter items of the control function;
and calling the control function, and triggering and calling the target processing function to carry out accelerated processing on the third network model to obtain the second network model.
Optionally, referring to fig. 10, the apparatus further comprises:
a test module 908, configured to, in response to a test instruction for the second network model, obtain second test data carried in the test instruction, and input the second test data and the second network model to a second test function in the function library;
the test module 908 is further configured to invoke the second test function, and trigger to process the second test data by using the second network model to obtain a second processing result;
the testing module 908 is further configured to determine a processing performance of the second network model based on the second processing result and a standard processing result corresponding to the second test data.
It should be noted that: the model conversion apparatus provided in the foregoing embodiment is only illustrated by the division of the functional modules, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the computer device is divided into different functional modules to complete all or part of the functions described above. In addition, the model conversion apparatus and the model conversion method provided in the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments and are not described herein again.
The embodiment of the present application further provides a computer device, where the computer device includes a processor and a memory, where the memory stores at least one computer program, and the at least one computer program is loaded and executed by the processor to implement the operations performed in the model conversion method of the foregoing embodiment.
Optionally, the computer device is provided as a terminal. Fig. 11 illustrates a schematic structural diagram of a terminal 1100 according to an exemplary embodiment of the present application.
The terminal 1100 includes: a processor 1101 and a memory 1102.
Processor 1101 may include one or more processing cores, such as a 4-core processor, an 8-core processor, or the like. The processor 1101 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 1101 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 1101 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that a display screen needs to display. In some embodiments, the processor 1101 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 1102 may include one or more computer-readable storage media, which may be non-transitory. Memory 1102 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in memory 1102 is used to store at least one computer program to be executed by processor 1101 to implement the model conversion methods provided by the method embodiments of the present application.
In some embodiments, the terminal 1100 may further include: a peripheral interface 1103 and at least one peripheral. The processor 1101, memory 1102 and peripheral interface 1103 may be connected by a bus or signal lines. Various peripheral devices may be connected to the peripheral interface 1103 by buses, signal lines, or circuit boards. Optionally, the peripheral device comprises: at least one of radio frequency circuitry 1104, display screen 1105, camera assembly 1106, audio circuitry 1107, and power supply 1108.
The peripheral interface 1103 may be used to connect at least one peripheral device related to I/O (Input/Output) to the processor 1101 and the memory 1102. In some embodiments, the processor 1101, memory 1102, and peripheral interface 1103 are integrated on the same chip or circuit board; in some other embodiments, any one or both of the processor 1101, the memory 1102, and the peripheral device interface 1103 may be implemented on separate chips or circuit boards.
The Radio Frequency circuit 1104 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1104 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 1104 converts an electric signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electric signal. Optionally, the radio frequency circuit 1104 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, etc.
The display screen 1105 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1105 is a touch display screen, the display screen 1105 also has the ability to capture touch signals on or over the surface of the display screen 1105. The touch signal may be input to the processor 1101 as a control signal for processing. At this point, the display screen 1105 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard.
Those skilled in the art will appreciate that the configuration shown in fig. 11 does not constitute a limitation of terminal 1100, and may include more or fewer components than those shown, or may combine certain components, or may employ a different arrangement of components.
Optionally, the computer device is provided as a server. Fig. 12 is a schematic structural diagram of a server 1200 according to an embodiment of the present application. The server 1200 may vary considerably in configuration or performance, and may include one or more processors (CPUs) 1201 and one or more memories 1202, where the memory 1202 stores at least one computer program, and the at least one computer program is loaded and executed by the processors 1201 to implement the methods provided by the foregoing method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface so as to perform input/output, and the server may further include other components for implementing the functions of the device, which are not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, where at least one computer program is stored in the computer-readable storage medium, and the at least one computer program is loaded and executed by a processor to implement the operations performed by the model transformation method of the foregoing embodiment.
Embodiments of the present application further provide a computer program product, which includes a computer program, and the computer program is loaded and executed by a processor to implement the operations performed by the model conversion method according to the above embodiments. In some embodiments, the computer program according to the embodiments of the present application may be deployed to be executed on one computer device, on multiple computer devices located at one site, or on multiple computer devices distributed at multiple sites and interconnected by a communication network; the multiple computer devices distributed at the multiple sites and interconnected by the communication network may constitute a blockchain system.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only an alternative embodiment of the present application and should not be construed as limiting the present application, and any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. A method of model conversion, the method comprising:
analyzing a first network model belonging to a first deep learning framework to obtain operator information of the first network model, wherein the operator information comprises at least one operator and a processing parameter corresponding to each operator;
inputting the operator information into a model creation function in a function library in response to a model creation instruction for the operator information, the model creation function being used for creating a model belonging to a second deep learning framework;
and calling the model creating function, and creating a second network model based on the operator information.
2. The method according to claim 1, wherein before inputting the operator information into a model creation function in a function library in response to a model creation instruction for the operator information, the method further comprises:
responding to a function packaging instruction, and acquiring a code carried in the function packaging instruction, wherein the code is used for creating a model belonging to the second deep learning framework based on operator information to be input;
and packaging the code to obtain the model creating function.
3. The method of claim 1, wherein the model creation function comprises a generation function and an objective processing function, the generation function and the objective processing function belonging to the second deep learning framework; the calling the model creating function, and creating a second network model based on the operator information, including:
calling the generating function, and generating a third network model based on the operator information and the blank network model, wherein the third network model is a network model which is not subjected to accelerated processing;
and calling the target processing function to perform accelerated processing on the third network model to obtain the second network model.
4. The method of claim 3, wherein said invoking said generating function to generate a third network model based on said operator information and a blank network model comprises:
based on the operator information, inquiring a processing function corresponding to the at least one operator in the function library, wherein the function library comprises a plurality of processing functions belonging to the second deep learning framework, and the processing function corresponding to the operator is a processing function for realizing the same function as the operator;
and inputting the processing parameters and the processing functions corresponding to the at least one operator into the generating function, calling the generating function, and correspondingly filling the processing functions corresponding to the processing parameters into the blank network model to obtain the third network model.
5. The method of claim 4, further comprising:
for a target operator belonging to the first deep learning framework, querying at least one first processing interface corresponding to the target operator in a first interface library, where the at least one first processing interface corresponding to the target operator refers to at least one first processing interface for implementing a function of the target operator, the target operator is any one operator belonging to the first deep learning framework, and the first interface library includes a plurality of first processing interfaces belonging to the second deep learning framework;
under the condition that at least one first processing interface corresponding to the target operator is inquired, responding to an encapsulation instruction of the at least one first processing interface, encapsulating the at least one first processing interface to obtain a processing function corresponding to the target operator, and storing the processing function corresponding to the target operator in the function library;
under the condition that at least one first processing interface corresponding to the target operator is not inquired, at least one second processing interface corresponding to the target operator is determined in a second interface library, a processing plug-in including the at least one second processing interface is created in response to a function creating instruction aiming at the at least one second processing interface, the processing plug-in is packaged into a processing function corresponding to the target operator and belonging to the second deep learning framework, the processing function corresponding to the target operator is stored in the function library, and the second interface library comprises a plurality of second processing interfaces belonging to the first deep learning framework.
6. The method according to claim 4, wherein after the processing function corresponding to the processing parameter is correspondingly filled into the blank network model to obtain the third network model, the method further comprises at least one of:
responding to a checking instruction of the filled first processing function, calling a checking function, checking the first processing function to obtain a checking result, wherein the checking result is used for indicating whether the first processing function has errors;
responding to an identifier setting instruction of the filled second processing function, calling the identifier setting function, and determining a function identifier of the second processing function as a function identifier carried in the identifier setting instruction;
in response to an input marking instruction for the populated third processing function, calling the input marking function to mark the third processing function as the first processing function in the third network model;
in response to an output marking instruction for the populated fourth processing function, calling an output marking function to mark the fourth processing function as the last processing function in the third network model.
7. The method according to claim 4, wherein after the processing function corresponding to the processing parameter is correspondingly filled into the blank network model to obtain the third network model, the method further comprises:
in response to a test instruction for a filled fifth processing function, acquiring first test data carried by the test instruction, and inputting the first test data and the fifth processing function into a first test function in the function library;
calling the first test function to trigger the fifth processing function to process the first test data, obtaining a first processing result;
and determining the processing performance of the fifth processing function based on the first processing result and a standard processing result corresponding to the first test data.
8. The method of claim 3, further comprising:
calling the target processing function to serialize the second network model, obtaining a serialized second network model.
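Claim 8's serialization step can be illustrated as turning the converted model into a portable byte string. A real framework would typically emit an engine or plan file; the toy model description and `serialize_model` name below are assumptions for the example.

```python
import json

# Sketch of a target processing function that serializes the second network model.
def serialize_model(model):
    """Serialize a toy model description to a deterministic byte string."""
    return json.dumps(model, sort_keys=True).encode("utf-8")

model = {"operators": ["conv", "relu"], "params": {"conv": {"k": 3}}}
blob = serialize_model(model)           # serialized second network model
```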
9. The method according to claim 3, wherein the model creation function further comprises an auxiliary creation function and a parameter configuration function, and the calling of the target processing function to perform accelerated processing on the third network model to obtain the second network model comprises:
in response to a creation instruction for a control function, calling the auxiliary creation function to create the control function and parameter items of the control function, wherein the control function is used to call the target processing function;
in response to a parameter configuration instruction for the control function, calling the parameter configuration function to configure parameter values for the parameter items of the control function;
and calling the control function to trigger the target processing function to perform accelerated processing on the third network model, obtaining the second network model.
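The three steps of claim 9 can be sketched as: an auxiliary creation function builds a control function with empty parameter items, a parameter configuration function fills in the values, and invoking the control function triggers the target (acceleration) function. All names below are assumptions for illustration.

```python
# Hedged sketch of claim 9: control function + parameter items + target call.
def create_control_function(target):
    """Auxiliary creation function: returns the control function and its parameter items."""
    params = {}
    def control(model):
        return target(model, **params)   # invoking control triggers the target function
    return control, params

def configure(params, **values):
    """Parameter configuration function: set values for the parameter items."""
    params.update(values)

def accelerate(model, precision="fp32"):
    """Stand-in target function yielding the 'accelerated' second network model."""
    return {"model": model, "precision": precision}

control, params = create_control_function(accelerate)
configure(params, precision="fp16")
engine = control("third_network_model")
```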
10. The method according to any one of claims 1 to 9, wherein after said calling the model creation function to create a second network model based on the operator information, the method further comprises:
in response to a test instruction for the second network model, acquiring second test data carried by the test instruction, and inputting the second test data and the second network model into a second test function in the function library;
calling the second test function to trigger the second network model to process the second test data, obtaining a second processing result;
and determining the processing performance of the second network model based on the second processing result and a standard processing result corresponding to the second test data.
11. A model conversion apparatus, characterized in that the apparatus comprises:
an analysis module, configured to parse a first network model belonging to a first deep learning framework to obtain operator information of the first network model, wherein the operator information comprises at least one operator and a processing parameter corresponding to each operator;
an input module, configured to input the operator information into a model creation function in a function library in response to a model creation instruction for the operator information, wherein the model creation function is used to create a model belonging to a second deep learning framework;
and a creation module, configured to call the model creation function and create a second network model based on the operator information.
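The apparatus of claim 11 amounts to a three-step pipeline: parse the first model into operator information, hand that information to the model creation function in the function library, and invoke it. The sketch below uses a toy model format and illustrative names; the claim does not fix either.

```python
# Illustrative three-module flow: analysis -> input -> creation.
def parse(first_model):
    """Analysis module: operator info = operators plus per-operator parameters."""
    return [(op, params) for op, params in first_model]

def model_creation_function(operator_info):
    """Creates a model belonging to the second framework (a toy dict here)."""
    return {op: params for op, params in operator_info}

def convert(first_model, function_library):
    info = parse(first_model)                       # analysis module
    create = function_library["model_creation"]     # input module: feed the function
    return create(info)                             # creation module: invoke it

lib = {"model_creation": model_creation_function}
second = convert([("conv", {"k": 3}), ("relu", {})], lib)
```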
12. The apparatus according to claim 11, further comprising:
a function encapsulation module, configured to acquire, in response to a function encapsulation instruction, code carried in the function encapsulation instruction, wherein the code is used to create a model belonging to the second deep learning framework based on operator information to be input;
the function encapsulation module being further configured to encapsulate the code to obtain the model creation function.
13. A computer device, characterized in that the computer device comprises a processor and a memory, the memory storing at least one computer program that is loaded and executed by the processor to implement the operations performed by the model conversion method according to any one of claims 1 to 10.
14. A computer-readable storage medium, characterized in that at least one computer program is stored therein, the computer program being loaded and executed by a processor to implement the operations performed by the model conversion method according to any one of claims 1 to 10.
15. A computer program product, comprising a computer program that is loaded and executed by a processor to implement the operations performed by the model conversion method according to any one of claims 1 to 10.
CN202210724402.0A 2022-06-23 2022-06-23 Model conversion method, device, computer equipment and storage medium Pending CN115115048A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210724402.0A CN115115048A (en) 2022-06-23 2022-06-23 Model conversion method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115115048A true CN115115048A (en) 2022-09-27

Family

ID=83329027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210724402.0A Pending CN115115048A (en) 2022-06-23 2022-06-23 Model conversion method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115115048A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115762515A (en) * 2022-11-08 2023-03-07 北京百度网讯科技有限公司 Processing and application method, device and equipment of neural network for voice recognition
CN115762515B (en) * 2022-11-08 2023-12-01 北京百度网讯科技有限公司 Processing and application method, device and equipment for neural network for voice recognition
CN116362316A (en) * 2023-05-29 2023-06-30 成都阿加犀智能科技有限公司 Model conversion method and device, storage medium and electronic equipment
CN116362316B (en) * 2023-05-29 2023-12-12 成都阿加犀智能科技有限公司 Model conversion method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN115115048A (en) Model conversion method, device, computer equipment and storage medium
CN111210005B (en) Equipment operation method and device, storage medium and electronic equipment
CN111178512B (en) Device operation neural network test method and device
CN103019928A (en) Automatic testing method and system
CN113570030B (en) Data processing method, device, equipment and storage medium
CN103136201A (en) Page information display method and page information display device
CN113868120A (en) Industrial software debugging method and device, computer equipment and storage medium
CN110750298B (en) AI model compiling method, equipment and storage medium
CN106202685A (en) A kind of software and hardware cooperating simulation accelerator operation environmental structure method and apparatus
CN110532182A (en) A kind of automated testing method and device of virtual platform
CN117892773A (en) Model deployment method, system, electronic device and storage medium
CN116227629B (en) Information analysis method, model training method, device and electronic equipment
CN117707439A (en) Log printing method and related device
CN115952044A (en) Automatic testing method and device
CN116633804A (en) Modeling method, protection method and related equipment of network flow detection model
CN116775453A (en) Systems and methods for providing autonomous driving simulation architecture with switchable models
CN106776322B (en) method for dynamically configuring, loading and operating test cases
CN112765022B (en) Webshell static detection method based on data stream and electronic equipment
CN109683879B (en) Front-end component processing method and device
CN106156422A (en) A kind of method and apparatus encapsulating standard in combination simulation modeling interface
CN111240972A (en) Model verification device based on source code
CN116700703B (en) Service processing method, device, equipment and storage medium
US20240272999A1 (en) System, method, and computer-readable storage medium for system-on-chip verification
CN117234930A (en) Test method, device, equipment, storage medium and product
CN115705191A (en) Instruction information generation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination