CN114942782A - Code migration method and device of model - Google Patents

Code migration method and device of model

Info

Publication number
CN114942782A
CN114942782A (application CN202111122126.2A)
Authority
CN
China
Prior art keywords
code
nodes
node
converted
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111122126.2A
Other languages
Chinese (zh)
Inventor
王恺
王世领
何剑
项乐强
胡晓辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202111122126.2A priority Critical patent/CN114942782A/en
Publication of CN114942782A publication Critical patent/CN114942782A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/70 Software maintenance or management
    • G06F 8/76 Adapting program code to run in a different environment; Porting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/25 Integrating or interfacing systems involving database management systems
    • G06F 16/258 Data format conversion from or to a database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/40 Transformation of program code
    • G06F 8/41 Compilation
    • G06F 8/43 Checking; Contextual analysis
    • G06F 8/436 Semantic checking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Stored Programmes (AREA)

Abstract

The application provides a code migration method and device for a model in the field of artificial intelligence. The method comprises the following steps: acquiring first code of a model constructed based on a first hardware device; converting the first code into a syntax tree and analyzing the syntax tree to obtain nodes to be converted; replacing each node to be converted with one or more corresponding target nodes to obtain a replaced syntax tree, where the target nodes are nodes supported by a second hardware device; and generating second code based on the replaced syntax tree. The method and device help realize automatic migration of model code across hardware platforms and can improve the migration efficiency of the model's code.

Description

Code migration method and device of model
Technical Field
Embodiments of the present application relate to the field of artificial intelligence, and in particular to a code migration method and device for a model.
Background
Artificial intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making. Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision and reasoning, human-computer interaction, recommendation and search, basic AI theory, and the like.
There are many hardware platforms for deep learning. In some scenarios, a user needs to migrate the code of a model built on one hardware platform to another hardware platform. A code migration tool for models can realize code migration between different hardware platforms.
However, the migration effect of existing code migration tools is not ideal. For example, X2Paddle is a model code migration tool in the PaddlePaddle ecosystem that can convert models of mainstream deep learning frameworks into models of the PaddlePaddle framework. However, X2Paddle generalizes poorly and supports only some scenarios, often requiring users with algorithm development experience to modify code manually. For example, before migration, a user needs to replace tensor operators such as logical AND, OR, and XOR with their corresponding application programming interfaces (APIs), and a parent-class inheritance relationship must be specified for a user-defined dataset. Automatic migration is therefore difficult to achieve, which reduces the migration efficiency of the model and degrades user experience.
Disclosure of Invention
The application provides a code migration method and device for a model, which help realize automatic migration of model code across hardware platforms and can improve the migration efficiency of the model's code.
In a first aspect, a code migration method for a model is provided, including: obtaining first code of a model, where the first code is constructed based on a first hardware device; converting the first code into a syntax tree and analyzing the syntax tree to obtain nodes to be converted, where the nodes to be converted include at least one of the following: an application programming interface (API) to be converted or an operator to be converted; replacing each node to be converted with one or more corresponding target nodes to obtain a replaced syntax tree, where the target nodes are nodes supported by a second hardware device; and generating second code based on the replaced syntax tree.
According to this scheme, semantic analysis performed on the syntax tree can obtain syntactic information in the code, such as the nesting relationships between operators. This helps identify each node in the first code accurately, improves the accuracy of locating the nodes to be converted, and thereby helps improve the accuracy of the migrated code. Meanwhile, the nodes to be converted are replaced directly on the syntax tree and the second code is generated from the replaced syntax tree; exploiting the tree structure reduces manual involvement, for example, the need to analyze code semantics by hand or to designate the nodes to be converted. This facilitates complete migration of the model's code and can improve code migration efficiency across hardware platforms.
The deep learning frames adopted by the first hardware device and the second hardware device may be the same or different.
The model in the embodiments of the present application may be a deep learning model, for example, a neural network model.
The first code of the model may comprise a training code of the model or a prediction code of the model.
Illustratively, the first code of the model may be structured into a tree structure, that is, a syntax tree, by means of a concrete syntax tree (CST) or an abstract syntax tree (AST). The nodes to be converted can then be located within the syntax tree structure.
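As a minimal sketch of the parse-and-locate step, the following Python example uses the standard `ast` module to build a syntax tree from source code and collect the dotted names of all API calls in it. The example source line and the API names in it are hypothetical illustrations, not part of the patent's described implementation:

```python
import ast

def dotted_name(node):
    """Flatten a chain of Attribute nodes, e.g. torch.cuda.FloatTensor."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
    return ".".join(reversed(parts))

def find_api_calls(source):
    """Parse source into a syntax tree and return the dotted name of each call."""
    tree = ast.parse(source)
    return [dotted_name(n.func) for n in ast.walk(tree)
            if isinstance(n, ast.Call)]

# A hypothetical line of first code built for a GPU device:
print(find_api_calls("y = torch.cuda.FloatTensor([1.0, 2.0])"))
# → ['torch.cuda.FloatTensor']
```

Walking the tree rather than matching text is what lets a tool see nesting relationships between operators, as the scheme above describes.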
With reference to the first aspect, in some implementations of the first aspect, the one or more target nodes corresponding to a first node among the nodes to be converted comprise a plurality of nodes supported by the second hardware device, where the function implemented by the first node is the same as the function implemented by the combination of the plurality of nodes, and the first node does not support conversion using a conversion rule.
The first node may be understood as a type of node: a node that does not support conversion using a conversion rule may be regarded as a first node.
In this embodiment, a first node that does not support conversion using a conversion rule can be converted into a combination of a plurality of nodes supported by the second device; this combination can run on the second device and realize a function consistent with that of the first node. This helps improve the conversion capability for the model, improves the performance of the converted code, reduces the number and scope of code modifications, demands little algorithm development experience from the user, reduces labor cost, and enables cross-platform migration.
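A hedged sketch of the first-node case: an operator assumed to lack a direct conversion rule (a hypothetical `rsqrt`) is replaced on the tree with a functionally equivalent combination of supported nodes, `1.0 / sqrt(x)`, using Python's `ast.NodeTransformer`, and code is regenerated from the replaced tree:

```python
import ast

class RsqrtRewriter(ast.NodeTransformer):
    """Rewrite calls to a hypothetical rsqrt(x), assumed to have no direct
    conversion rule on the target, into the equivalent combination
    1.0 / sqrt(x)."""
    def visit_Call(self, node):
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id == "rsqrt":
            sqrt_call = ast.Call(func=ast.Name(id="sqrt", ctx=ast.Load()),
                                 args=node.args, keywords=[])
            return ast.BinOp(left=ast.Constant(value=1.0),
                             op=ast.Div(), right=sqrt_call)
        return node

def migrate(source):
    """Parse, replace the node on the tree, and regenerate code (Python 3.9+)."""
    tree = ast.parse(source)
    tree = RsqrtRewriter().visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

For example, `migrate("y = rsqrt(x)")` yields `"y = 1.0 / sqrt(x)"`, while code without the operator passes through unchanged.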
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: and providing alarm information, wherein the alarm information is used for indicating the first node.
In this way, nodes that may be problematic can be flagged to the user, which improves user experience.
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: running the second code on the second hardware device to obtain an operation result of an intermediate variable during the running of the second code, where the operation result of the intermediate variable includes a value of the intermediate variable or a data type of the intermediate variable; and adjusting the second code according to the operation result of the intermediate variable to obtain third code.
In this scheme, the running process is dynamically analyzed to obtain the operation results of intermediate variables, and the converted second code is further adjusted accordingly, which improves the accuracy of the third code.
With reference to the first aspect, in some implementations of the first aspect, adjusting the second code according to an operation result of the intermediate variable to obtain the third code includes: and under the condition that the second hardware equipment does not support the operation result of the intermediate variable, adjusting the statement related to the operation result of the intermediate variable in the second code to obtain the third code.
The specific adjustment can be set as required. For example, adjusting the statements related to the intermediate variable in the second code may include at least one of the following: adding a statement related to the operation result of the intermediate variable to the second code, or deleting or modifying such a statement in the second code.
With reference to the first aspect, in some implementations of the first aspect, the second hardware device does not support an operation result of the intermediate variable, and may include at least one of: the second hardware device does not support the data type of the intermediate variable, or the value of the intermediate variable exceeds a range supported by the second hardware device.
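A small sketch of the dynamic-adjustment idea: run the code, inspect an intermediate variable's data type, and cast it when the target device is assumed not to support it. The supported-dtype list here is hypothetical, not taken from the patent:

```python
import numpy as np

# Assumed capability list of the second hardware device (hypothetical).
SUPPORTED_DTYPES = {"float32", "int32"}

def check_and_cast(var):
    """If the second device does not support the intermediate variable's
    data type, adjust it (here: cast to float32); otherwise leave it as is."""
    if var.dtype.name not in SUPPORTED_DTYPES:
        return var.astype(np.float32)
    return var

x = np.arange(3, dtype=np.float64)   # float64 assumed unsupported on the target
y = check_and_cast(x)                # y.dtype is now float32
```

In a real migration tool the same check would drive an edit to the second code, for example inserting the equivalent cast statement before the offending operation.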
With reference to the first aspect, in certain implementations of the first aspect, the nodes to be converted include nodes that include a target key, the target key indicating the first hardware device.
The target keyword is used to indicate the first hardware device, and may be understood as indicating the first hardware device itself, or may be understood as indicating the first deep learning framework adopted by the first hardware device.
In this embodiment, a node containing a target keyword is treated as a node to be converted; that is, the nodes related to the first hardware device are the nodes to be converted. This realizes code migration of the model between different hardware devices while reducing unnecessary conversion as much as possible (for example, a node unrelated to the hardware device need not be converted), thereby improving the conversion efficiency of the model.
With reference to the first aspect, in certain implementations of the first aspect, the target keyword includes at least one of: a first target keyword or a second target keyword. The first target keyword may be determined by a first mapping relation, which indicates the correspondence between a plurality of first candidate keywords and a plurality of hardware devices; the first target keyword belongs to the first candidate keywords, and the first hardware device belongs to the plurality of hardware devices. The second target keyword may be determined by a second mapping relation, which indicates the correspondence between a plurality of second candidate keywords and a plurality of deep learning frameworks; the second target keyword belongs to the second candidate keywords, and the first deep learning framework belongs to the plurality of deep learning frameworks.
Optionally, the first mapping relationship is updated based on the target key and the first hardware device.
That is, the target keyword is added to the first candidate keywords as a first candidate keyword corresponding to the first hardware device.
Therefore, the first mapping relation is enriched, and the code migration efficiency of the model is further improved.
Optionally, the second mapping relationship is updated based on the target keyword and the first deep learning framework.
That is, the target keyword is added to the second candidate keywords as a second candidate keyword corresponding to the first deep learning framework.
Therefore, the second mapping relation is enriched, and the code migration efficiency of the model is further improved.
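The keyword mapping relations above can be sketched as simple lookup tables. All keywords, device names, and framework names below are assumptions for illustration only:

```python
# Hypothetical mapping relations: candidate keyword -> hardware device / framework.
HARDWARE_KEYWORDS = {"cuda": "gpu", "npu": "npu", "cpu": "cpu"}
FRAMEWORK_KEYWORDS = {"torch": "pytorch", "tf": "tensorflow"}

def is_node_to_convert(dotted, source_device="gpu", source_framework="pytorch"):
    """Mark a node for conversion if any part of its dotted name is a
    keyword mapped to the source device or the source framework."""
    for part in dotted.split("."):
        if HARDWARE_KEYWORDS.get(part) == source_device:
            return True
        if FRAMEWORK_KEYWORDS.get(part) == source_framework:
            return True
    return False

def register_hardware_keyword(keyword, device):
    """Update the first mapping relation with a new candidate keyword."""
    HARDWARE_KEYWORDS[keyword] = device
```

Under these assumptions, `torch.cuda.FloatTensor` would be marked for conversion while `numpy.zeros` would not, and `register_hardware_keyword` enriches the first mapping relation as described above.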
With reference to the first aspect, in some implementations of the first aspect, the one or more target nodes corresponding to a second node among the nodes to be converted are one or more nodes, supported by the second hardware device, that are obtained by conversion using a conversion rule, and the second node supports conversion using the conversion rule.
The second node may be understood as a type of node, and nodes supporting conversion using the conversion rule may be referred to as second nodes.
In the embodiment of the application, for the second node supporting the conversion by using the conversion rule, the conversion can be directly performed based on the conversion rule, which is beneficial to improving the conversion efficiency of the model.
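The two node types can be sketched side by side: a second node hits a direct conversion rule and is mapped immediately, while a first node has no rule and yields alarm information indicating it. The rule table and the target API names are hypothetical illustrations:

```python
# Hypothetical one-to-one conversion rules (source API -> target API).
CONVERSION_RULES = {
    "torch.cuda.FloatTensor": "target.Tensor",
    "torch.cuda.is_available": "target.is_available",
}

def convert_node(name):
    """Second node: a direct rule exists, convert immediately.
    First node: no rule; return alarm information instead."""
    if name in CONVERSION_RULES:
        return CONVERSION_RULES[name], None
    return None, f"warning: no conversion rule for {name}"
```

Collecting the `(source, target)` pairs and the warnings produced by such a function would also provide the analysis report and alarm information mentioned above.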
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: and outputting an analysis report, wherein the analysis report is used for indicating the node to be converted and one or more target nodes corresponding to the node to be converted.
Illustratively, the node to be converted and the one or more target nodes to which the node to be converted corresponds include at least one of: the first node and one or more target nodes corresponding to the first node, or the second node and one or more target nodes corresponding to the second node.
In this way, information about the conversion operations can be provided to the user, making it convenient to assess the conversion; for example, the user can judge whether further modification is needed based on this information, which improves user experience.
In a second aspect, an apparatus for code migration of a model is provided, the apparatus comprising means for performing the method of the first aspect or any implementation manner of the first aspect.
In a third aspect, an apparatus for code migration of a model is provided, the apparatus comprising: a memory for storing a program; a processor for executing the memory-stored program, the processor being configured to perform the method of the first aspect or any implementation of the first aspect when the memory-stored program is executed.
In a fourth aspect, a computer readable medium is provided, which stores program code for execution by a device, the program code comprising instructions for performing the method of the first aspect or any implementation of the first aspect.
In a fifth aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect or any implementation of the first aspect.
In a sixth aspect, a chip is provided, where the chip includes a processor and a data interface, and the processor reads instructions stored in a memory through the data interface to perform the method in the first aspect or any implementation manner of the first aspect.
Optionally, as an implementation, the chip may further include a memory storing instructions, and the processor is configured to execute the instructions stored in the memory; when the instructions are executed, the processor is configured to perform the method in the first aspect or any implementation of the first aspect.
In a seventh aspect, an electronic device is provided, where the electronic device includes the apparatus in the first aspect or any implementation manner of the first aspect.
Drawings
FIG. 1 is a schematic diagram of an artificial intelligence main framework provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a system architecture according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a code migration tool for a model according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of a code migration method for a model according to an embodiment of the present application;
FIG. 5 is a schematic block diagram of a code migration apparatus for a model provided by an embodiment of the present application;
FIG. 6 is a schematic block diagram of another code migration apparatus for a model provided by an embodiment of the present application.
Detailed Description
The technical solution in the present application will be described below with reference to the accompanying drawings.
FIG. 1 shows a schematic diagram of an artificial intelligence main framework, which describes the overall workflow of an artificial intelligence system and is applicable to general requirements in the artificial intelligence field.
The artificial intelligence main framework is set forth below along two dimensions: the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis).
The "intelligent information chain" reflects a series of processes starting from data acquisition, for example, the general processes of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output. In this process, data undergoes a refinement process of "data - information - knowledge - wisdom".
The "IT value chain" reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure of artificial intelligence and information (technology for providing and processing information) up to the industrial ecology of the system.
(1) Infrastructure:
The infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and provides support through a base platform. The infrastructure communicates with the outside through sensors; computing power is provided by intelligent chips (hardware acceleration chips such as CPUs, NPUs, GPUs, ASICs, and FPGAs); and the base platform includes related platform guarantees and support such as distributed computing frameworks and networks, and may include cloud storage and computing, interconnection networks, and the like. For example, sensors and external communications acquire data, which is provided to intelligent chips in a distributed computing system offered by the base platform for computation.
(2) Data
Data at the layer above the infrastructure represents the data sources of the artificial intelligence field. The data involves graphs, images, voice, and text, as well as Internet-of-things data from traditional devices, including service data of existing systems and sensing data such as force, displacement, liquid level, temperature, and humidity.
(3) Data processing
Data processing typically includes data training, machine learning, deep learning, searching, reasoning, decision making, and the like.
Machine learning and deep learning can perform symbolized and formalized intelligent information modeling, extraction, preprocessing, training, and the like on data.
Inference refers to the process of simulating human intelligent inference in a computer or intelligent system, using formalized information to think about and solve problems by machine according to an inference control strategy; a typical function is search and matching.
Decision-making refers to the process of making decisions after reasoning over intelligent information, and generally provides functions such as classification, ranking, and prediction.
(4) General purpose capabilities
After the above-mentioned data processing, further general capabilities may be formed based on the results of the data processing, such as algorithms or a general system, for example, translation, analysis of text, computer vision processing, speech recognition, recognition of images, and so on.
(5) Intelligent product and industrial application
Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields. They encapsulate the overall artificial intelligence solution, productize intelligent information decision-making, and realize practical applications. The application fields mainly include: intelligent manufacturing, intelligent transportation, smart home, intelligent healthcare, intelligent security, autonomous driving, smart cities, intelligent terminals, and the like.
The method and device of the embodiments of this application can be applied to scenarios involving conversion between models of different deep learning frameworks in fields such as autonomous driving, image classification, image retrieval, image semantic segmentation, image quality enhancement, image super-resolution, surveillance, target tracking, and target detection.
Since the embodiments of the present application relate to the application of a large number of neural networks, for the sake of understanding, the following description will be made first of all with respect to terms and concepts of the neural networks to which the embodiments of the present application may relate.
(1) Neural network
The neural network may be composed of neural units. A neural unit may be an arithmetic unit that takes x_s and an intercept of 1 as inputs, and the output of the unit may be:

h_{W,b}(x) = f(W^T x) = f( sum_{s=1}^{n} W_s * x_s + b )

where s = 1, 2, ..., n, n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit. f is the activation function of the neural unit, which introduces a nonlinear characteristic into the neural network to transform the input signal of the neural unit into an output signal. The output signal of the activation function may be used as the input of the next layer. For example, the activation function may be a ReLU, tanh, or sigmoid function.
A neural network is a network formed by joining together a plurality of the above single neural units; that is, the output of one neural unit may be the input of another neural unit. The input of each neural unit can be connected with the local receptive field of the previous layer to extract the features of the local receptive field, and the local receptive field may be a region composed of several neural units.
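The single-unit output formula above can be worked as a tiny Python example, with the sigmoid assumed as the activation function f (the weights and inputs are arbitrary illustration values):

```python
import math

def neuron(xs, ws, b):
    """Output of a single neural unit: f(sum_s W_s * x_s + b),
    with the sigmoid chosen as the activation function f."""
    z = sum(w * x for w, x in zip(ws, xs)) + b
    return 1.0 / (1.0 + math.exp(-z))
```

With a zero weighted sum and zero bias the sigmoid gives exactly 0.5, and any input yields an output strictly between 0 and 1.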
(2) Deep neural network
Deep neural networks (DNNs), also called multi-layer neural networks, can be understood as neural networks with multiple hidden layers. According to the positions of the layers, the layers inside a DNN can be divided into three categories: the input layer, the hidden layers, and the output layer. Generally, the first layer is the input layer, the last layer is the output layer, and the middle layers are hidden layers. The layers are fully connected; that is, any neuron of the ith layer is necessarily connected with any neuron of the (i+1)th layer.
Although a DNN appears complex, the work of each layer is not complex; it is simply the following linear relational expression:

y = α(W x + b)

where x is the input vector, y is the output vector, b is the offset (bias) vector, W is the weight matrix (also called the coefficients), and α() is the activation function. Each layer simply performs this operation on the input vector x to obtain the output vector y. Because a DNN has many layers, the coefficients W and offset vectors b are also numerous. These parameters are defined in the DNN as follows, taking the coefficient W as an example: assume that in a three-layer DNN, the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer is defined as W^3_{24}, where the superscript 3 represents the layer in which the coefficient W is located, and the subscripts correspond to the output index 2 in the third layer and the input index 4 in the second layer.

In summary, the coefficient from the kth neuron at layer L-1 to the jth neuron at layer L is defined as W^L_{jk}. Note that the input layer has no W parameters. In a deep neural network, more hidden layers enable the network to better depict complex situations in the real world. Theoretically, a model with more parameters has higher complexity and larger "capacity", which means it can accomplish more complex learning tasks. Training the deep neural network is the process of learning the weight matrices; its final objective is to obtain the weight matrices (formed by the vectors W of many layers) of all layers of the trained deep neural network.
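The per-layer operation y = α(W x + b) can be sketched in a few lines of NumPy, with ReLU assumed as the activation function α:

```python
import numpy as np

def layer(x, W, b):
    """One DNN layer: y = alpha(W @ x + b), with ReLU chosen as alpha."""
    return np.maximum(0.0, W @ x + b)
```

For example, with W the identity matrix and b zero, a negative input component is zeroed by the ReLU while a positive one passes through unchanged.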
(3) Loss function
In the process of training a deep neural network, because the output of the network is expected to be as close as possible to the value actually desired, the weight vector of each layer can be updated according to the difference between the predicted value of the current network and the actually desired target value (of course, there is usually an initialization process before the first update, in which parameters are preconfigured for each layer of the deep neural network). For example, if the predicted value of the network is too high, the weight vectors are adjusted to predict lower, and the adjustment continues until the deep neural network can predict the actually desired target value or a value very close to it. Therefore, it is necessary to define in advance "how to compare the difference between the predicted value and the target value"; this is the role of the loss function or objective function, which are important equations for measuring that difference. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a greater difference, so training the deep neural network becomes a process of reducing the loss as much as possible. Generally, the smaller the loss, the higher the training quality of the deep neural network, and the larger the loss, the lower the training quality. Similarly, the smaller the loss fluctuation, the more stable the training; the larger the loss fluctuation, the more unstable the training.
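As a tiny worked example of such a loss function, mean squared error in plain Python (one of many possible choices; the patent does not prescribe a specific loss):

```python
def mse_loss(pred, target):
    """Mean squared error: the larger the loss, the larger the difference
    between the predicted values and the target values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
```

A perfect prediction gives a loss of 0, and the loss grows as predictions drift from the targets, which is exactly the quantity training tries to reduce.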
As shown in fig. 2, the present embodiment provides a system architecture 100. In fig. 2, a data acquisition device 170 is used to acquire training data. For example, for the code migration method of the model according to the embodiment of the present application, the type of the training data is related to the task type of the model, for example, the model is an image processing model, and the training data may include a training image and a true value (ground true) corresponding to the training image. For example, if the task of the model is an image classification task, the true value corresponding to the training image may be a classification result corresponding to the training image, and the classification result of the training image may be a result manually pre-labeled.
After the training data is collected, the data collection device 170 stores the training data in the database 130, and the training device 120 trains the target model/rule 101 based on the training data maintained in the database 130. In the case that the deep learning framework adopted by the training code of the target model/rule 101 is different from the deep learning framework supported by the system architecture 100, the training device 120 may convert the training code into the training code under the deep learning framework supported by the system architecture 100, and then train the target model/rule 101 with the converted training code based on the training data maintained in the database 130. Alternatively, the training code may be converted into the training code under the deep learning framework supported by the system architecture 100 in advance, and then the training device 120 may train the training code after conversion based on the training data maintained in the database 130 to obtain the target model/rule 101
The following describes how the training device 120 obtains the target model/rule 101 based on the training data: the training device 120 processes the input raw data and compares the output value with the target value until the difference between the output value of the training device 120 and the target value is smaller than a certain threshold, thereby completing the training of the target model/rule 101.
The target model/rule 101 in the embodiment of the present application may specifically be a neural network model, such as a convolutional neural network or a residual network. It should be noted that, in practical applications, the training data maintained in the database 130 does not necessarily all come from the data acquisition device 170, and may also be received from other devices. It should also be noted that the training device 120 does not necessarily train the target model/rule 101 entirely based on the training data maintained by the database 130, and may also obtain training data from the cloud or elsewhere for model training.
The target model/rule 101 obtained by training with the training device 120 may be applied to different systems or devices, for example, the execution device 110 shown in fig. 2. The execution device 110 may be a terminal, such as a mobile phone terminal, a tablet computer, a notebook computer, an augmented reality (AR)/virtual reality (VR) device, or a vehicle-mounted terminal, or may be a server or a cloud. In fig. 2, the execution device 110 is configured with an input/output (I/O) interface 112 for data interaction with an external device, and a user can input data to the I/O interface 112 through the client device 140.
In the process that the execution device 110 preprocesses the input data, or in the process that the calculation module 111 of the execution device 110 performs calculation or other related processing, the execution device 110 may call data, code, and the like in the data storage system 150 for the corresponding processing, and may store the data, instructions, and the like obtained by the corresponding processing in the data storage system 150.
Finally, the I/O interface 112 returns the processing result, such as the processing result of the data obtained as described above, to the client device 140, thereby providing it to the user.
The deep learning framework supported by the training device 120 and the deep learning framework supported by the execution device 110 may or may not be the same. In the case that the deep learning framework supported by the training device 120 is different from the deep learning framework supported by the execution device 110, the target model/rule 101 trained by the training device 120 may be converted into the target model/rule under the deep learning framework supported by the execution device 110 by using the code migration method of the model in the embodiment of the present application.
It is noted that the training device 120 may generate corresponding target models/rules 101 for different goals or different tasks based on different training data, and the corresponding target models/rules 101 may be used to achieve the goals or complete the tasks, thereby providing the user with the desired results.
In the case shown in fig. 2, the user may manually give the input data, which may be operated through an interface provided by the I/O interface 112. Alternatively, the client device 140 may automatically send the input data to the I/O interface 112; if the client device 140 is required to obtain authorization from the user for automatically sending the input data, the user may set the corresponding permissions in the client device 140. The user can view the result output by the execution device 110 at the client device 140, and the specific presentation form may be display, sound, action, and the like. The client device 140 may also serve as a data collection terminal, collecting the input data of the I/O interface 112 and the output results of the I/O interface 112 as new sample data, and storing the new sample data in the database 130. Of course, instead of being collected by the client device 140, the input data of the I/O interface 112 and the output results of the I/O interface 112 as shown in the figure may also be directly stored in the database 130 as new sample data by the I/O interface 112.
It should be noted that fig. 2 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the position relationship between the devices, modules, and the like shown in the diagram does not constitute any limitation, for example, in fig. 2, the data storage system 150 is an external memory with respect to the execution device 110, and in other cases, the data storage system 150 may also be disposed in the execution device 110.
As shown in fig. 2, the target model/rule 101 is obtained by training according to the training device 120, and the target model/rule 101 may be a neural network model in the embodiment of the present application.
The deep learning frameworks used by different hardware platforms may be different, and the code of models built based on different deep learning frameworks has different expression forms. Code constructed based on one deep learning framework cannot be directly adapted to another deep learning framework. Moreover, because different hardware platforms support different syntaxes, interfaces, and the like, even if different hardware platforms use the same deep learning framework, code constructed on one hardware platform cannot be directly adapted to another hardware platform. Therefore, during cross-platform migration of the code of a model, the code of the model needs to be converted to be adapted to the target hardware platform.
Existing code migration tools for models have poor generalization capability and can only support migration in some scenarios. In particular, in the cross-platform migration of training code, a user is usually required to have algorithm development experience in order to modify the code, and the range of code to be modified is large, which seriously reduces the model migration efficiency and affects the user experience.
The embodiment of the application provides a code migration method of a model, which is beneficial to realizing the automatic migration of the code of the model across hardware platforms and improving the code migration efficiency of the model.
FIG. 3 is a schematic diagram illustrating a code migration tool of a model provided by an embodiment of the present application.
The code migration tool 300 includes an Integrated Development Environment (IDE) tool 310 and a Command Line (CL) tool 320. IDE tool 310 invokes CL tool 320 to implement the code migration of the model.
Specifically, IDE tool 310 includes an interaction module 311 and a control layer 312.
The interaction module 311 is used for implementing interaction with a user.
In particular, the interaction module 311 may enable interaction with a user through an IDE interface, i.e., implement IDE interface logic of the code migration tool.
For example, the interaction module 311 may obtain the user input through a pop-up dialog box, an option of providing an input path, or an operation interface such as button click.
Therefore, the requirement on algorithm development experience of a user can be reduced, and the user experience is improved.
The control layer 312 is used to interface the IDE interface with the CL implementation, implementing IDE interface independent business logic.
The interaction module 311 invokes an interface of the control layer 312 to translate the user input into commands for the CL tool.
Specifically, the control layer 312 is used to implement the splicing and issuing of commands, and invoke the CL tool 320 to start the conversion process.
Illustratively, the control layer may also be referred to as a traffic handling layer.
For example, the control layer 312 may be a commonlib msft module 312 shown in FIG. 3.
Wherein the IDE interface can be understood as a visual encapsulation of the CL tool.
In other words, the user may implement graphical interaction through the IDE interface and then implement automatic command concatenation and issuing through the commonlib msft module 312, invoking the CL tool 320.
It should be noted that, even without the IDE tool 310, the user may use the CL tool 320 directly by entering parameters and then issuing commands to implement the code migration operation of the model.
The various modules in fig. 3 may also be provided as plug-and-play plug-ins. For example, the interaction module 311 may also be configured as an interaction plug-in (plugin).
The CL tool 320 includes a flow logic module 321 and at least one framework processing module 322.
In particular, the flow logic module 321 is configured to control the execution of the method 400 described below. For example, the flow logic module 321 is used to determine which specific steps in the method 400 are performed. As another example, the flow logic module 321 is used to determine the order of execution of the steps in the method 400. As another example, the flow logic module 321 is used to determine the execution conditions of the steps in the method 400, and the like.
Illustratively, as shown in FIG. 3, the flow logic module 321 may employ a mindstudio frame migration (msft) module 321.
The processing module 322 of the at least one framework is used for converting the code constructed by the at least one framework.
As shown in fig. 3, the at least one framework includes TensorFlow and PyTorch. The processing module for TensorFlow is the msft-TensorFlow module 322 in FIG. 3. The processing module for PyTorch is the msft-PyTorch module 322 in FIG. 3.
Optionally, the processing module 322 of the at least one framework further comprises a configuration information module 323 of the framework. The configuration information module 323 of the framework is used for setting configuration information of the framework.
Illustratively, the configuration information includes information such as optional function or API correspondence.
For example, the optional functions may include a function for user-defined conversion rules and the like. The configuration information module 323 can set whether the user-defined conversion rule function is automatically enabled, and provide the user with the function of customizing conversion rules. In this way, the user can customize the conversion rules.
For example, as shown in FIG. 3, the configuration information module 323 of the msft-TensorFlow module 322 is a msft-config-TensorFlow module. The configuration information module 323 of the msft-PyTorch module 322 is a msft-config-PyTorch module 323.
The code migration method of the model in the embodiment of the present application is described in detail below with reference to fig. 4.
FIG. 4 illustrates a code migration method 400 for a model provided by an embodiment of the present application. The method shown in fig. 4 may be executed by a computing apparatus, which may be a cloud service device, or a terminal device, for example, a computer, a server, a mobile phone, a camera, a vehicle, an unmanned aerial vehicle, or a robot, or a system composed of a cloud service device and a terminal device.
Illustratively, the method 400 may be performed by the training device 120 or the execution device 110 shown in FIG. 2. Further, the method 400 may be performed by the CL tool in fig. 3. For example, if the first hardware device employs a TensorFlow framework, the method 400 may be performed by the flow logic module 321 in the CL tool and the processing module 322 for TensorFlow.
The method 400 includes steps S401 to S404. The following describes steps S401 to S404 in detail.
S401, a first code of the model is obtained. The first code is constructed based on a first hardware device.
The syntax, API, etc. supported by different hardware devices are different, resulting in different expressions of the code built on different hardware devices.
When migrating a first code constructed on a first hardware device to a second hardware device, the first code needs to be converted into a representation form supported by the second hardware device.
The deep learning framework employed by the first hardware device and the second hardware device may be the same or different.
The deep learning framework in the embodiment of the present application may be an existing deep learning framework. For example, the deep learning framework may be a TensorFlow framework, a PyTorch framework, an open neural network exchange (ONNX) framework, or a convolutional architecture for fast feature embedding (Caffe) framework, etc.
A hardware device may also be referred to as a hardware platform. Illustratively, the hardware device may employ at least one of the following accelerators: a central processing unit (CPU), a graphics processing unit (GPU), a neural network processing unit (NPU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), and the like. The type of the hardware device is not limited in the embodiments of the present application, as long as the hardware device can execute the code of the model to implement data processing.
Illustratively, the hardware device may be the training device 120 in fig. 2, or the hardware device may also be the execution device 110 in fig. 2.
The model in the embodiments of the present application may be a deep learning model, for example, a neural network model.
The first code of the model may comprise a training code of the model or a prediction code of the model.
The training code of the model implements the training process of the model. The prediction code of the model implements the inference process of the model. That is, the prediction code of the model may be the model itself.
That is to say, the scheme of the embodiment of the present application may be used to implement migration of training code of the model, and may also implement migration of the model itself.
Illustratively, the code of the model may be represented in the form of a script. For example, the training code of the model may be a training script of the model. The training script of the model comprises a plurality of parts such as preprocessing, backbone (backbone), post-processing and training optimizer.
Illustratively, the first code of the model may be user input, e.g., the user may input the first code of the model through the interaction module 311 in fig. 3. Alternatively, the first code of the model may also be received from other devices. The embodiment of the present application does not limit a specific obtaining manner of the first code.
S402, converting the first code into a syntax tree, and analyzing the syntax tree to obtain a node to be converted.
The nodes to be converted include at least one of: an Application Programming Interface (API) to be converted or an operator to be converted.
There may be one or more nodes to be converted in the syntax tree, which is not limited in the embodiment of the present application.
Optionally, the nodes to be converted include nodes containing a target keyword. The target keyword is used to indicate the first hardware device.
The target keyword is used to indicate the first hardware device, and may be understood as indicating the first hardware device itself, or may be understood as indicating the first deep learning framework adopted by the first hardware device.
The number of the target keywords may be one or more.
A node containing the target keyword may also be understood as a node associated with the first hardware device.
The nodes in the first code may be divided into nodes associated with the first hardware device and nodes not associated with the first hardware device.
A node associated with a first hardware device may be understood as a node specific to the first hardware device.
For example, one node in the first code is an addition operator, which is used to implement an addition operation between two values. The node is a node unrelated to the first hardware device.
For another example, if the first deep learning framework adopted by the first hardware device is a TensorFlow framework, and one node in the first code is a convolution operator under the TensorFlow framework, then the node is a node related to the first hardware device.
It should be understood that the operators in the first code in the embodiment of the present application may be existing operators, such as a convolution operator or a fully connected operator. Alternatively, the operators in the first code may also be operators constructed by the user, which is not limited in this embodiment of the present application.
The following exemplifies the nodes containing the keywords by taking the operators to be converted as examples.
If the node to be converted is the operator to be converted, the node containing the target keyword may also be referred to as the operator containing the target keyword.
For example, the first hardware device adopts a PyTorch framework, and the first code of the model is constructed based on the PyTorch framework. An operator specific to the PyTorch framework is usually expressed in the form of torch.x, and the target keyword may be torch, so that the operators in the form of torch.x in the first code of the model are the operators to be converted. For another example, the first hardware device uses a TensorFlow framework, and the first code of the model is constructed based on the TensorFlow framework. An operator specific to the TensorFlow framework is usually expressed in the form of tf.x, and the target keyword may be tf, so that the operators in the form of tf.x in the first code of the model are the operators to be converted.
Illustratively, the target keywords include at least one of: a first target keyword or a second target keyword. The first target keyword may be determined by a first mapping relationship. The first mapping relationship is used for indicating the correspondence between a plurality of first candidate keywords and a plurality of hardware devices. The first target keyword belongs to the plurality of first candidate keywords. The first hardware device belongs to the plurality of hardware devices. The second target keyword may be determined by a second mapping relationship. The second mapping relationship is used for indicating the correspondence between a plurality of second candidate keywords and a plurality of deep learning frameworks. The second target keyword belongs to the plurality of second candidate keywords. The first deep learning framework belongs to the plurality of deep learning frameworks.
In other words, the first candidate keyword corresponding to the first hardware device among the plurality of first candidate keywords, that is, the first target keyword, is determined according to the first mapping relationship. The second candidate keyword corresponding to the first deep learning framework among the plurality of second candidate keywords, that is, the second target keyword, is determined according to the second mapping relationship.
As described above, there may be one or more target keywords. One deep learning framework may correspond to one second target keyword, or may correspond to a plurality of second target keywords. One hardware device may correspond to one first target keyword, or may correspond to a plurality of first target keywords. The embodiment of the present application does not limit this.
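As an illustration of the first and second mapping relationships, the keyword lookup can be sketched with plain dictionaries. The device names, framework names, and keywords below are hypothetical examples, not an exhaustive or authoritative list:

```python
# First mapping relationship: hardware devices -> first candidate keywords.
# One hardware device may correspond to several first target keywords.
first_mapping = {
    "gpu_device_x": ["cuda"],        # hypothetical hardware device name
    "npu_device_y": ["npu", "acl"],  # hypothetical hardware device name
}

# Second mapping relationship: deep learning frameworks -> second candidate keywords.
second_mapping = {
    "PyTorch": ["torch"],
    "TensorFlow": ["tf", "tensorflow"],
}

def target_keywords(first_hardware_device, first_framework):
    """Collect the first and second target keywords used to locate nodes to be converted."""
    return (first_mapping.get(first_hardware_device, [])
            + second_mapping.get(first_framework, []))

print(target_keywords("npu_device_y", "PyTorch"))  # ['npu', 'acl', 'torch']
```

A manually set keyword would simply be appended to the corresponding list, which is the dictionary form of "updating the first mapping relationship" described below.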
Alternatively, the target keyword may be set manually.
In this case, the first mapping relationship may be updated based on the target key and the first hardware device. That is, the target keyword is added to the first candidate keyword as the first candidate keyword corresponding to the first hardware device.
Therefore, the first mapping relation is enriched, and the code migration efficiency of the model is further improved.
Alternatively, in this case, the second mapping relationship may be updated based on the target keyword and the first deep learning framework. Namely, the target keyword is added to the second candidate keyword to be used as the second candidate keyword corresponding to the first deep learning frame.
Therefore, the second mapping relation is enriched, and the code migration efficiency of the model is further improved.
The first mapping relationship may be stored in a database of the expert system.
Illustratively, the expert system may update the first mapping relationship by scanning the network.
Specifically, the expert system may convert different models by using the scheme of the embodiment of the present application, use a target keyword used in the conversion process as a keyword corresponding to the current hardware device, and update the first mapping relationship based on the target keyword, thereby improving the generalization ability of the expert system.
For example, the first mapping relation does not include the current hardware device, and the target keyword is set manually. In this case, the target keyword may be used as a keyword corresponding to the current hardware device, and the first mapping relationship may be updated based on the target keyword.
The second mapping relationship may be stored in a database of the expert system.
Illustratively, the expert system may update the second mapping relationship by scanning the network. For specific description, the first mapping relationship may be referred to, and only the hardware device therein needs to be replaced by a deep learning framework, which is not described herein again.
In the embodiment of the present application, a node containing a target keyword, that is, a node related to the first hardware device, is used as a node to be converted, so that code migration of the model between different hardware devices is realized while unnecessary conversion is reduced as much as possible; for example, nodes unrelated to the hardware device may not be converted, thereby improving the conversion efficiency of the model.
It should be understood that the above is only an example, and the nodes to be converted may also be set in other manners, for example, all nodes in the first code may also be taken as the nodes to be converted in step S402.
The syntax tree refers to a tree structure containing syntax information of the code.
Illustratively, the first code of the model may be structured into a tree structure, i.e., a syntax tree, by means of a Concrete Syntax Tree (CST) or an Abstract Syntax Tree (AST). The nodes to be converted can be located according to the syntax tree structure.
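As one possible realization of step S402, Python's built-in `ast` module can structure the first code into an abstract syntax tree and locate the nodes containing a target keyword (here `torch`). This is a simplified sketch, not the patented implementation itself; the snippet of first code is hypothetical:

```python
import ast

first_code = """
import torch
x = torch.ones(2, 2)
y = torch.nn.functional.relu(x)
z = x + y   # plain addition, unrelated to the first hardware device
"""

tree = ast.parse(first_code)  # the syntax tree of the first code

def root_name(node):
    """Walk an attribute chain such as torch.nn.functional.relu down to its root name."""
    while isinstance(node, ast.Attribute):
        node = node.value
    return node.id if isinstance(node, ast.Name) else None

# Nodes to be converted: call nodes whose name chain starts with the target keyword.
target_keyword = "torch"
to_convert = [n for n in ast.walk(tree)
              if isinstance(n, ast.Call) and root_name(n.func) == target_keyword]

print([ast.unparse(n.func) for n in to_convert])
# ['torch.ones', 'torch.nn.functional.relu']
```

Note how the addition `x + y` is left alone: it is a node unrelated to the first hardware device, so no conversion is needed.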
And S403, replacing the nodes to be converted with one or more target nodes corresponding to the nodes to be converted to obtain the replaced syntax tree. And the target node is a node supported by the second hardware equipment.
S404, generating a second code based on the replaced syntax tree.
In the embodiment of the application, syntax information in the code can be obtained by performing semantic analysis using the syntax tree, for example, the nesting relationship between operators, which is beneficial to accurately identifying each node in the first code, improving the positioning accuracy of the nodes to be converted, and improving the accuracy of the migrated code. Meanwhile, the nodes to be converted are replaced on the syntax tree, and the second code is generated based on the replaced syntax tree; using the structure of the syntax tree reduces the processes requiring human participation, for example, manual semantic analysis of the code and manual designation of the nodes to be converted, which is beneficial to the complete migration of the code of the model and can improve the efficiency of code migration across hardware platforms.
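Steps S403 and S404 (replacing nodes on the syntax tree, then regenerating code) can likewise be sketched with the standard `ast` module. The conversion rule below, mapping `torch.ones` to a hypothetical `npu.ones` target node, is purely illustrative:

```python
import ast

first_code = "x = torch.ones(2, 2)\n"

# Hypothetical conversion rule: node supported by the first hardware device
# -> target node supported by the second hardware device.
rule = {"torch.ones": "npu.ones"}

class Replacer(ast.NodeTransformer):
    def visit_Attribute(self, node):
        self.generic_visit(node)
        dotted = ast.unparse(node)
        if dotted in rule:
            # Replace the node to be converted with its target node.
            return ast.parse(rule[dotted], mode="eval").body
        return node

tree = ast.parse(first_code)                       # S402: first code -> syntax tree
new_tree = ast.fix_missing_locations(Replacer().visit(tree))  # S403: replace nodes
second_code = ast.unparse(new_tree)                # S404: generate the second code
print(second_code)  # x = npu.ones(2, 2)
```

Because the replacement happens on tree nodes rather than on raw text, nested expressions are handled correctly without fragile string matching.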
Optionally, one or more target nodes corresponding to a first node in the nodes to be converted are multiple nodes supported by the second hardware device, and a function implemented by the first node is the same as a function implemented by a combination of the multiple nodes. The first node does not support conversion using the conversion rule.
The first node may be understood as a type of node, and nodes that do not support conversion using the conversion rule may be understood as the first node.
Illustratively, the conversion rule includes a third mapping relationship. The third mapping relationship is used for indicating a corresponding relationship between the nodes supported by the first hardware device and the nodes supported by the second hardware device. The functions realized by the nodes having the correspondence relationship in the two hardware devices are the same. In this case, a node that does not support conversion using the conversion rule may be understood as a node that is not included in the third mapping relationship.
For the first node, an equivalent replacement manner may be utilized to search, among a plurality of candidate nodes supported by the second hardware device, for an appropriate replacement item, i.e., a combination of a plurality of nodes consistent with the function implemented by the first node. The first node has an equivalent correspondence with the combination of the plurality of nodes.
For example, the first node includes operator a supported by the first device, and the conversion rule does not support conversion of operator a, in which case, if operators B and C supported by the second device can implement the function of operator a, operator a may be converted into a combination of operator B and operator C.
The conversion rules may be preset or may be user-defined. The embodiment of the present application does not limit this.
In the embodiment of the application, a first node that does not support conversion using the conversion rule can be converted into a combination of a plurality of nodes supported by the second device, and the combination of the plurality of nodes can be run on the second device to implement a function consistent with that of the first node. This is beneficial to improving the conversion capability of the model, improving the performance of the converted code, and reducing the number and range of codes to be modified; it places fewer requirements on the algorithm development experience of the user, can reduce labor cost, and realizes cross-platform migration.
Further, the conversion rule is updated based on the plurality of target nodes corresponding to the first node. Therefore, the later first node can support conversion based on the updated conversion rule, and the conversion efficiency of the model is improved.
Illustratively, the conversion rules may be stored in a database of the expert system. The expert system may update the conversion rules through a scan.
Specifically, the expert system may convert different models by using the scheme of the embodiment of the present application, and update the conversion rule based on a combination of the first node and a plurality of nodes supported by the current hardware device corresponding to the first node in the conversion process, thereby improving the processing capability of the expert system.
Optionally, the one or more target nodes corresponding to the second node in the nodes to be converted include one or more nodes of the second hardware device obtained by using the conversion rule. The second node is a node that supports conversion using the conversion rule.
The second node may be understood as a type of node, and nodes supporting conversion using the conversion rule may be referred to as second nodes. That is, among the nodes to be converted, the nodes other than the first node are the second nodes. Or, in the nodes to be converted, the nodes other than the second node are the first nodes.
For the second node, the conversion may be performed directly based on the conversion rule.
Illustratively, the conversion rule includes a third mapping relationship. The third mapping relationship is used for indicating a corresponding relationship between the nodes supported by the first hardware device and the nodes supported by the second hardware device.
One node supported by the first hardware device may correspond to one node supported by the second hardware device, or one node supported by the first hardware device may correspond to a combination of a plurality of nodes supported by the second hardware device. The embodiment of the present application does not limit this.
For example, the second node includes an operator a supported by the first hardware device, and it is determined according to the conversion rule that there is a correspondence between the operator a supported by the first hardware device and the operator a ' supported by the second hardware device, that is, the operator a ' can implement the function of the operator a, in which case, the operator a can be converted into the operator a ' based on the conversion rule. For another example, the second node includes an operator a supported by the first hardware device, and it is determined according to the conversion rule that there is a correspondence between the operator a supported by the first hardware device and a combination of the operator B and the operator C supported by the second hardware device, that is, the combination of the operator B and the operator C can implement a function of the operator a, that is, a single operator capable of implementing the operator a may not be supported on the second hardware device, and in this case, the operator a may be converted into the combination of the operator B and the operator C based on the conversion rule.
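The two lookup cases above, a one-to-one correspondence and a one-to-many combination, can be sketched as a single rule table. All operator names here are hypothetical placeholders, not operators of any real framework:

```python
# Third mapping relationship: node supported by the first hardware device
# -> one target node, or a combination of target nodes, on the second device.
conversion_rule = {
    "op_A": ["op_A_prime"],    # operator A -> operator A' (one-to-one)
    "op_D": ["op_B", "op_C"],  # operator D -> combination of B and C (one-to-many)
}

def convert(node_name):
    targets = conversion_rule.get(node_name)
    if targets is None:
        # First-node case: the rule does not support this node; an equivalent
        # combination must be searched for among candidate nodes instead,
        # and the conversion rule can then be updated with the result.
        raise KeyError(f"{node_name} does not support conversion using the rule")
    return targets

print(convert("op_A"))  # ['op_A_prime']
print(convert("op_D"))  # ['op_B', 'op_C']
```

Representing both cases as a list of target nodes keeps the replacement step uniform: the second node is always replaced by one or more target nodes, exactly as step S403 describes.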
Illustratively, the conversion rules may be stored in a database of the expert system. The conversion rules are searched in the database of the expert system to implement the grammar substitution.
In the embodiment of the application, for the second node supporting the conversion by using the conversion rule, the conversion can be directly performed based on the conversion rule, which is beneficial to improving the conversion efficiency of the model.
Optionally, the method 400 further comprises: and providing alarm information, wherein the alarm information is used for indicating the first node.
Therefore, the nodes with possible problems can be prompted to the user, and the user experience is improved.
Optionally, the method 400 further comprises step S405.
S405, running the second code of the model on the second hardware device to obtain the operation result of the intermediate variable in the running process of the second code, and adjusting the second code according to the operation result of the intermediate variable to obtain the third code of the model. The operation result of the intermediate variable includes a value of the intermediate variable or a data type of the intermediate variable.
The second code of the model is run on the second hardware device, which can also be understood as the second code that controls the second hardware device to execute the model.
The code is in a static expression form, and state information in the running process, such as an operation result of an intermediate variable, cannot be obtained through static syntax analysis. If the second code is directly deployed on the second hardware device for use, an erroneous operation result may be caused. In the scheme of the embodiment of the application, the running process is dynamically analyzed to obtain the operation result of the intermediate variable in the running process, the converted second code is further adjusted, and the accuracy of the third code is improved.
Optionally, adjusting the second code according to the operation result of the intermediate variable to obtain the third code of the model includes: in a case where the second hardware device does not support the operation result of the intermediate variable, adjusting the statement in the second code that is related to the operation result of the intermediate variable to obtain the third code.
That is, when the second hardware device does not support the operation result of an intermediate variable, the related statement in the second code is adjusted, so that the operation results of the intermediate variables produced when the adjusted second code runs can be supported by the second hardware device.
The specific adjustment can be set as required. For example, adjusting the statement in the second code that is related to the intermediate variable may include at least one of the following: adding a statement related to the operation result of the intermediate variable to the second code, or deleting or modifying such a statement in the second code.
That the second hardware device does not support the operation result of the intermediate variable may include at least one of the following: the second hardware device does not support the data type of the intermediate variable, or the value of the intermediate variable exceeds the range supported by the second hardware device.
For example, during the run of the second code, the data type of an intermediate variable is fp64, so the called API needs to support fp64, while the data type supported by the second hardware device is fp32. In this case, the statements in the second code related to the data type of the intermediate variable may be modified to convert the data type of the intermediate variable to fp32.
For another example, during the run of the second code, the value of an intermediate variable falls outside the range supported by the second hardware device. In this case, the statement in the second code related to the value of the intermediate variable may be modified so that the value falls within the range supported by the second hardware device.
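The two adjustments just described — narrowing an unsupported data type and clamping an out-of-range value — can be sketched as a pass over the recorded run-time results. The supported type set and value range below are assumptions for illustration, not capabilities stated in this application.

```python
# Hypothetical capabilities of the second hardware device (assumptions only).
SUPPORTED_DTYPES = {"fp32", "int32"}
SUPPORTED_MAX = 3.4e38  # roughly the fp32 maximum

def adjust_result(dtype: str, value: float) -> tuple:
    """Adjust the operation result of an intermediate variable so that the
    second hardware device supports it; each branch stands in for modifying
    the related statement in the second code."""
    if dtype not in SUPPORTED_DTYPES:
        dtype = "fp32"  # e.g. convert fp64 to fp32, as in the example above
    if abs(value) > SUPPORTED_MAX:
        # clamp the value into the range supported by the device
        value = SUPPORTED_MAX if value > 0 else -SUPPORTED_MAX
    return dtype, value
```

In a real migration tool the adjustment would rewrite source statements rather than runtime values; this sketch only shows the decision logic.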
In the case where the method 400 includes step S405, the method 400 may further include: and outputting the third code.
In the case where the method 400 does not include step S405, the method 400 may further include: and outputting the second code.
Optionally, the method 400 further comprises step S406.
S406, output an analysis report, where the analysis report is used to indicate the node to be converted and the one or more target nodes corresponding to the node to be converted.
Illustratively, the node to be converted and the one or more target nodes to which the node to be converted corresponds include at least one of: the first node and one or more target nodes corresponding to the first node, or the second node and one or more target nodes corresponding to the second node.
That is, the analysis report may be used to indicate the conversion operations between each node to be converted and the one or more target nodes corresponding to it.
In this way, information about the conversion operations can be provided to the user, making it easier for the user to evaluate the result of the conversion; for example, the user can judge, based on this information, whether further modification is needed, which improves the user experience.
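A minimal sketch of such an analysis report — purely illustrative, with hypothetical node names — renders each (node to be converted, target nodes) pair recorded during replacement:

```python
def build_report(conversions) -> str:
    """Render an analysis report indicating each node to be converted and
    the one or more target nodes that replaced it."""
    lines = ["node to be converted -> target node(s)"]
    for src, targets in conversions:
        lines.append(f"{src} -> {', '.join(targets)}")
    return "\n".join(lines)

# A second node maps to one target via a rule; a first node maps to a
# combination of nodes with the same overall function (names hypothetical).
report = build_report([
    ("op_a", ["npu_op_a"]),
    ("op_b", ["npu_op_b1", "npu_op_b2"]),
])
```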
The apparatus of the embodiments of the present application will be described below with reference to fig. 5 and fig. 6. It should be understood that the apparatus described below is capable of performing the methods of the foregoing embodiments of the present application; to avoid unnecessary repetition, repeated descriptions are appropriately omitted when the apparatus of the embodiments of the present application is described below.
FIG. 5 is a schematic block diagram of a code migration apparatus of a model of an embodiment of the present application. The code migration apparatus 4000 of the model shown in fig. 5 includes an acquisition unit 4010 and a processing unit 4020.
The obtaining unit 4010 and the processing unit 4020 may be configured to execute the code migration method 400 of the model according to the embodiment of the present application.
Specifically, the obtaining unit 4010 is configured to obtain a first code of the model, where the first code is constructed based on the first hardware device.
The processing unit 4020 is configured to: convert the first code into a syntax tree, and analyze the syntax tree to obtain nodes to be converted, where the nodes to be converted include at least one of the following: an API to be converted or an operator to be converted; replace the nodes to be converted with the one or more target nodes corresponding to the nodes to be converted to obtain a replaced syntax tree, where the target nodes are nodes supported by a second hardware device; and generate a second code based on the replaced syntax tree.
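Taken together, the steps the processing unit performs — parse the first code into a syntax tree, locate the nodes to be converted, substitute the target nodes, and regenerate code — can be sketched with Python's standard `ast` module (`ast.unparse` requires Python 3.9+). The API names being rewritten are hypothetical, and a real implementation would also handle attribute accesses, keyword arguments, and one-to-many replacements.

```python
import ast

# Hypothetical mapping from a first-device API to a second-device API.
TARGETS = {"cuda_matmul": "npu_matmul"}

class Replacer(ast.NodeTransformer):
    """Replace each function-call node to be converted with its target node."""
    def visit_Call(self, node: ast.Call) -> ast.Call:
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id in TARGETS:
            node.func = ast.Name(id=TARGETS[node.func.id], ctx=ast.Load())
        return node

first_code = "y = cuda_matmul(a, b)"
tree = ast.parse(first_code)                # first code -> syntax tree
tree = Replacer().visit(tree)               # replace nodes to be converted
second_code = ast.unparse(ast.fix_missing_locations(tree))  # -> second code
```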
Optionally, as an embodiment, the one or more target nodes corresponding to the first node in the nodes to be converted include a plurality of nodes supported by the second hardware device, a function implemented by the first node is the same as a function implemented by a combination of the plurality of nodes, and the first node does not support conversion using the conversion rule.
Optionally, as an embodiment, the processing unit 4020 is further configured to: running the second code on the second hardware equipment to obtain an operation result of an intermediate variable in the running process of the second code, wherein the operation result of the intermediate variable comprises a value of the intermediate variable or a data type of the intermediate variable; and adjusting the second code according to the operation result of the intermediate variable to obtain the third code.
Optionally, as an embodiment, the processing unit 4020 is specifically configured to: and under the condition that the second hardware equipment does not support the operation result of the intermediate variable, adjusting the statement related to the operation result of the intermediate variable in the second code to obtain the third code.
Optionally, as an embodiment, the node to be converted includes a node including a target key, and the target key is used to indicate the first hardware device.
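Identifying nodes that contain a target key can be done with a simple walk over the syntax tree; using `'cuda'` as the key indicating the first hardware device is only an assumed example, not a key specified by this application.

```python
import ast

TARGET_KEY = "cuda"  # assumed keyword indicating the first hardware device

def nodes_with_key(source: str):
    """Collect name/attribute nodes whose identifier contains the target key;
    these are candidates for the set of nodes to be converted."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute) and TARGET_KEY in node.attr:
            found.append(node.attr)
        elif isinstance(node, ast.Name) and TARGET_KEY in node.id:
            found.append(node.id)
    return found

hits = nodes_with_key("x = model.cuda()\nn = cuda_count()")
```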
Optionally, as an embodiment, one or more target nodes corresponding to a second node in the nodes to be converted are one or more nodes supported by the second hardware device obtained by conversion using the conversion rule, and the second node supports conversion using the conversion rule.
Optionally, as an embodiment, the apparatus further includes: an output unit 4030, configured to output an analysis report, where the analysis report is used to indicate a node to be converted and one or more target nodes corresponding to the node to be converted.
It should be noted that the apparatus 4000 is implemented as a functional unit. The term "unit" herein may be implemented in software and/or hardware, and is not particularly limited thereto.
For example, a "unit" may be a software program, a hardware circuit, or a combination of both that implement the above-described functions. The hardware circuitry may include an Application Specific Integrated Circuit (ASIC), an electronic circuit, a processor (e.g., a shared processor, a dedicated processor, or a group of processors) and memory that execute one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functionality.
Thus, the units of each example described in the embodiments of the present application can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Fig. 6 is a hardware configuration diagram of a code migration apparatus of a model provided in an embodiment of the present application. Code migration apparatus 6000 of the model shown in fig. 6 (the apparatus 6000 may specifically be a computer device) includes a memory 6001, a processor 6002, a communication interface 6003, and a bus 6004. The memory 6001, the processor 6002, and the communication interface 6003 are connected to each other in a communication manner via a bus 6004.
The memory 6001 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 6001 may store a program; when the program stored in the memory 6001 is executed by the processor 6002, the processor 6002 is configured to perform the steps of the code migration method of the model of an embodiment of the application. In particular, the processor 6002 may perform the method 400 above.
The processor 6002 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is configured to execute related programs to implement the code migration method of the model of the embodiments of the present application.
The processor 6002 may also be an integrated circuit chip with signal processing capability. During implementation, the steps of the code migration method of the model of the present application may be completed by an integrated logic circuit of hardware in the processor 6002 or by instructions in the form of software.
The processor 6002 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical blocks disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed with reference to the embodiments of the present application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as a RAM, a flash memory, a ROM, a PROM, an EPROM, or a register. The storage medium is located in the memory 6001; the processor 6002 reads information from the memory 6001 and, in combination with its hardware, completes the functions to be performed by the units included in the apparatus shown in fig. 5, or performs the code migration method of the model of the method embodiments of the present application.
The communication interface 6003 enables communications between the apparatus 6000 and other devices or communication networks using transceiver means such as, but not limited to, a transceiver. For example, the first code may be acquired through the communication interface 6003.
The bus 6004 may include paths that convey information between various components of the device 6000 (e.g., memory 6001, processor 6002, communication interface 6003).
It should be noted that although the apparatus 6000 described above shows only a memory, a processor, and a communication interface, in a specific implementation process, those skilled in the art will appreciate that the apparatus 6000 may also include other devices necessary for normal operation. Furthermore, according to specific needs, those skilled in the art will appreciate that the apparatus 6000 may also include hardware devices for implementing other additional functions. In addition, those skilled in the art will appreciate that the apparatus 6000 may also include only the devices necessary to implement the embodiments of the present application, and need not include all of the devices shown in fig. 6.
Embodiments of the present application also provide a computer-readable storage medium storing program code for execution by a device, the program code including a code migration method for executing a model in an embodiment of the present application.
Embodiments of the present application also provide a computer program product containing instructions, which when run on a computer, cause the computer to execute the code migration method of the model in the embodiments of the present application.
The embodiment of the present application further provides a chip, where the chip includes a processor and a data interface, and the processor reads an instruction stored in a memory through the data interface to execute the code migration method of the model in the embodiment of the present application.
Optionally, as an implementation manner, the chip may further include a memory, where the memory stores instructions, and the processor is configured to execute the instructions stored on the memory, and when the instructions are executed, the processor is configured to execute the code migration method of the model in the embodiment of the present application.
The chip may be specifically an FPGA or an ASIC.
It should be understood that the processor in the embodiments of the present application may be a Central Processing Unit (CPU), and the processor may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
It should also be appreciated that the memory in the embodiments of the present application may be a volatile memory or a nonvolatile memory, or may include both a volatile memory and a nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which serves as an external cache. By way of example but not limitation, many forms of RAM are available, such as a static RAM (SRAM), a dynamic RAM (DRAM), a synchronous DRAM (SDRAM), a double data rate SDRAM (DDR SDRAM), an enhanced SDRAM (ESDRAM), a synchlink DRAM (SLDRAM), and a direct rambus RAM (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used for implementation, the above embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions or computer programs. When the computer instructions or the computer program are loaded or executed on a computer, the procedures or functions according to the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, or digital subscriber line) or wireless (e.g., infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible to the computer, or a data storage device, such as a server or a data center, that integrates one or more usable media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
It should be understood that the term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate the following three cases: only A exists, both A and B exist, or only B exists, where A and B may be singular or plural. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects, but may also indicate an "and/or" relationship; refer to the context for details.
In the present application, "at least one" means one or more, and "a plurality of" means two or more. "At least one of the following" or a similar expression thereof refers to any combination of these items, including any combination of singular or plural items. For example, at least one of a, b, or c may represent: a, b, c, a and b, a and c, b and c, or a, b, and c, where a, b, and c may be singular or plural.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above descriptions are merely specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any variation or replacement readily conceived by a person skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (18)

1. A method for code migration of a model, comprising:
obtaining a first code of a model, wherein the first code is constructed based on a first hardware device;
converting the first code into a syntax tree, and analyzing the syntax tree to obtain nodes to be converted, wherein the nodes to be converted comprise at least one of the following items: an application program interface API to be converted or an operator to be converted;
replacing the node to be converted with one or more target nodes corresponding to the node to be converted to obtain a replaced syntax tree, wherein the target nodes are nodes supported by a second hardware device;
generating a second code based on the replaced syntax tree.
2. The method according to claim 1, wherein the one or more target nodes corresponding to a first node among the nodes to be converted include a plurality of nodes supported by a second hardware device, the function implemented by the first node is the same as the function implemented by the combination of the plurality of nodes, and the first node does not support conversion using a conversion rule.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
running the second code on the second hardware device to obtain an operation result of an intermediate variable in the running process of the second code, wherein the operation result of the intermediate variable comprises a value of the intermediate variable or a data type of the intermediate variable;
and adjusting the second code according to the operation result of the intermediate variable to obtain a third code.
4. The method of claim 3, wherein the adjusting the second code according to the operation result of the intermediate variable to obtain a third code comprises:
and under the condition that the second hardware equipment does not support the operation result of the intermediate variable, adjusting a statement related to the operation result of the intermediate variable in the second code to obtain a third code.
5. The method of any of claims 1 to 4, wherein the nodes to be converted comprise nodes that contain a target key, the target key indicating the first hardware device.
6. The method according to any one of claims 1 to 5, wherein one or more target nodes corresponding to a second node among the nodes to be converted are one or more nodes supported by the second hardware device, which are obtained by conversion using a conversion rule, and the second node supports conversion using the conversion rule.
7. The method according to any one of claims 1 to 6, further comprising:
outputting an analysis report, wherein the analysis report is used for indicating the node to be converted and one or more target nodes corresponding to the node to be converted.
8. An apparatus for code migration of a model, comprising:
an acquisition unit configured to acquire a first code of a model, the first code being constructed based on a first hardware device;
a processing unit to:
converting the first code into a syntax tree, and analyzing the syntax tree to obtain nodes to be converted, wherein the nodes to be converted comprise at least one of the following items: an application program interface API to be converted or an operator to be converted;
replacing the node to be converted with one or more target nodes corresponding to the node to be converted to obtain a replaced syntax tree, wherein the target nodes are nodes supported by second hardware equipment;
generating a second code based on the replaced syntax tree.
9. The apparatus according to claim 8, wherein the one or more target nodes corresponding to a first node among the nodes to be converted include a plurality of nodes supported by a second hardware device, the function implemented by the first node is the same as the function implemented by the combination of the plurality of nodes, and the first node does not support conversion using a conversion rule.
10. The apparatus according to claim 8 or 9, wherein the processing unit is further configured to:
running the second code on the second hardware device to obtain an operation result of an intermediate variable in the running process of the second code, wherein the operation result of the intermediate variable comprises a value of the intermediate variable or a data type of the intermediate variable;
and adjusting the second code according to the operation result of the intermediate variable to obtain a third code.
11. The apparatus according to claim 10, wherein the processing unit is specifically configured to:
and under the condition that the second hardware equipment does not support the operation result of the intermediate variable, adjusting a statement related to the operation result of the intermediate variable in the second code to obtain a third code.
12. The apparatus of any of claims 8 to 11, wherein the nodes to be converted comprise nodes that contain a target key, the target key indicating the first hardware device.
13. The apparatus according to any one of claims 8 to 12, wherein one or more target nodes corresponding to a second node among the nodes to be converted are one or more nodes supported by the second hardware device, which are obtained by conversion using a conversion rule, and the second node supports conversion using the conversion rule.
14. The apparatus of any one of claims 8 to 13, further comprising:
an output unit, configured to output an analysis report, where the analysis report is used to indicate the node to be converted and one or more target nodes corresponding to the node to be converted.
15. A code migration apparatus for a model, comprising a processor and a memory, the memory for storing program instructions, the processor for calling the program instructions to perform the method of any of claims 1 to 7.
16. A computer-readable storage medium for storing program code for execution by a device, the program code comprising instructions for performing the method of any of claims 1-7.
17. A computer program product comprising instructions for causing a computer to perform the method according to any one of claims 1 to 7 when the computer program product is run on the computer.
18. A chip comprising a processor and a data interface, the processor reading instructions stored on a memory through the data interface to perform the method of any one of claims 1 to 7.
CN202111122126.2A 2021-09-24 2021-09-24 Code migration method and device of model Pending CN114942782A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111122126.2A CN114942782A (en) 2021-09-24 2021-09-24 Code migration method and device of model

Publications (1)

Publication Number Publication Date
CN114942782A 2022-08-26


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101208660A (en) * 2005-06-27 2008-06-25 奎朴兹有限公司 Code transformation
US20140282444A1 (en) * 2013-03-15 2014-09-18 ArtinSoft Corporation Programming language transformations with abstract syntax tree extensions
US20180136912A1 (en) * 2016-11-17 2018-05-17 The Mathworks, Inc. Systems and methods for automatically generating code for deep learning systems
CN111625224A (en) * 2020-05-28 2020-09-04 北京百度网讯科技有限公司 Code generation method, device, equipment and storage medium
CN111752571A (en) * 2020-06-29 2020-10-09 广州华多网络科技有限公司 Program upgrading method, device, equipment and storage medium
CN112819153A (en) * 2020-12-31 2021-05-18 杭州海康威视数字技术股份有限公司 Model transformation method and device
CN113283613A (en) * 2021-07-23 2021-08-20 上海燧原科技有限公司 Deep learning model generation method, optimization method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220826