CN112905189A - Model translation method, device and computer readable storage medium - Google Patents


Info

Publication number
CN112905189A
Authority
CN
China
Prior art keywords
model
target
neural network
network layer
translated
Prior art date
Legal status
Pending
Application number
CN202110209225.8A
Other languages
Chinese (zh)
Inventor
高威特
张楠赓
Current Assignee
Canaan Bright Sight Co Ltd
Original Assignee
Canaan Bright Sight Co Ltd
Priority date
Filing date
Publication date
Application filed by Canaan Bright Sight Co Ltd
Priority to CN202110209225.8A
Publication of CN112905189A
Priority to PCT/CN2022/073043 (WO2022179361A1)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00: Arrangements for software engineering
    • G06F 8/40: Transformation of program code
    • G06F 8/51: Source to source
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology

Abstract

The application provides a model translation method, a model translation device and a computer readable storage medium, wherein the method comprises the following steps: obtaining a model to be translated and a training data set; converting the neural network layers of the model to be translated into target neural network layers supported by a target processor to obtain a target model; and training the target model according to the training data set to determine the parameters of each target neural network layer in the target model. By this method, a model generated under a specific framework can be converted into a target model that the target processor can run, thereby expanding the scenarios in which the model can be deployed.

Description

Model translation method, device and computer readable storage medium
Technical Field
The application belongs to the field of artificial intelligence, and particularly relates to a model translation method and device and a computer readable storage medium.
Background
This section is intended to provide a background or context to the embodiments of the application that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
Engineers typically create neural network models using frameworks such as TensorFlow; some use the Caffe framework or the Darknet framework.
However, in the prior art, a neural network model created under a given framework can run only on a processor that supports that framework's format; in other words, models trained on these common frameworks may not run directly on other general-purpose processors, i.e., there is a format incompatibility between the model and the processor. How to resolve this format incompatibility between model and processor has therefore become a technical problem to be solved in the field.
Disclosure of Invention
The embodiments of the application provide a model translation method, a model translation device and a computer readable storage medium. With this method and device, the technical problem of format incompatibility between the model and the processor can be solved.
The following schemes are provided in the examples of the present application.
In a first aspect, a model translation method is provided, including: obtaining a model to be translated and a training data set, wherein the model to be translated comprises a plurality of neural network layers; converting the neural network layer of the model to be translated into a target neural network layer supported by a target processor to obtain a target model; and training the target model according to the training data set to determine parameters of each target neural network layer in the target model.
In one possible implementation, obtaining a model to be translated includes: receiving the program code of the model to be translated based on a programming interface, and determining the model to be translated; wherein the programming interface disables functions that are not supported by the target processor.
In one possible embodiment, converting the neural network layer of the model to be translated into a target neural network layer supported by a target processor includes: determining a layer mapping relation between a neural network layer in the model to be translated and a target neural network layer supported by a target processor based on an equivalent relation between a function of the neural network layer in the model to be translated and a function of the target neural network layer supported by the target processor; and equivalently converting the neural network layer in the model to be translated into a target neural network layer supported by the target processor based on the layer mapping relation.
In one possible embodiment, determining a layer mapping relationship between a neural network layer in the model to be translated and a target neural network layer supported by a target processor further includes: determining a layer mapping relation between a designated neural network layer in the model to be translated and a target neural network layer supported by a target processor based on the equivalent relation; and/or determining a layer mapping relation between a specified neural network layer in the model to be translated and a target neural network layer combination supported by the target processor based on the equivalent relation; and/or determining a layer mapping relation between a specified neural network layer combination in the model to be translated and a target neural network layer supported by the target processor based on the equivalent relation; and/or determining a layer mapping relation between the designated neural network layer combination in the model to be translated and the target neural network layer combination supported by the target processor based on the equivalent relation.
In one possible embodiment, the method further comprises: inputting sample data in the training data set into the target model to determine a numerical range of input values of at least one target neural network layer of the target model; and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
In one possible embodiment, the method further comprises: acquiring an object to be processed of the target model; inputting the object to be processed into the target model to determine a numerical range of input values of at least one target neural network layer of the target model; and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
In one possible embodiment, after determining the parameters of each target neural network layer in the target model, the method further includes: and generating an output code according to each target neural network layer of the target model and the corresponding parameters.
In one possible embodiment, the model to be translated is in pb format, h5 format, or darknet format.
In one possible implementation, the output code is in a C language format or a bin file format.
In a second aspect, there is provided a model translation apparatus, including: a determining module, configured to obtain a model to be translated and a training data set, wherein the model to be translated comprises a plurality of neural network layers; a layer conversion module, configured to convert the neural network layers of the model to be translated into target neural network layers supported by a target processor to obtain a target model; and a training module, configured to train the target model according to the training data set to determine the parameters of each target neural network layer in the target model.
In one possible embodiment, the determining module is configured to: receiving the program code of the model to be translated based on the programming interface, and determining the model to be translated; wherein the programming interface disables functions that are not supported by the target processor.
In one possible embodiment, the layer conversion module is configured to: determining a layer mapping relation between a neural network layer in the model to be translated and a target neural network layer supported by a target processor based on an equivalent relation between a function of the neural network layer in the model to be translated and a function of the target neural network layer supported by the target processor; and equivalently converting the neural network layer in the model to be translated into a target neural network layer supported by the target processor based on the layer mapping relation.
In one possible embodiment, the layer conversion module is configured to: determining a layer mapping relation between a designated neural network layer in the model to be translated and a target neural network layer supported by a target processor based on the equivalent relation; and/or determining a layer mapping relation between a specified neural network layer in the model to be translated and a target neural network layer combination supported by the target processor based on the equivalent relation; and/or determining a layer mapping relation between a specified neural network layer combination in the model to be translated and a target neural network layer supported by the target processor based on the equivalent relation; and/or determining a layer mapping relation between the designated neural network layer combination in the model to be translated and the target neural network layer combination supported by the target processor based on the equivalent relation.
In one possible embodiment, the training module is configured to: inputting sample data in the training data set into the target model to determine a numerical range of input values of at least one target neural network layer of the target model; and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
In one possible embodiment, the training module is further configured to: acquiring an object to be processed of the target model; inputting the object to be processed into the target model to determine a numerical range of input values of at least one target neural network layer of the target model; and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
In a possible embodiment, the apparatus further comprises: and the code generation module is used for generating output codes according to each target neural network layer of the target model and corresponding parameters.
In one possible embodiment, the model to be translated is in pb format, h5 format, or darknet format.
In one possible implementation, the output code is in a C language format or a bin file format.
In a third aspect, a model translation apparatus is provided, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform: the model translation method of the first aspect.
In a fourth aspect, there is provided a computer-readable storage medium storing a program which, when executed by a processor, causes the processor to perform the model translation method of the first aspect.
The embodiments of the application adopt at least one technical scheme that can achieve the following beneficial effects: each neural network layer of a model to be translated, created under any framework, is converted into a target neural network layer supported by the target processor, thereby obtaining a target model supported by the target processor, so that a neural network model trained under a specific framework can run on the target processor.
It should be understood that the above description is only an overview of the technical solutions of the present application, so as to enable the technical solutions of the present application to be more clearly understood, and thus can be implemented according to the content of the description. In order to make the aforementioned and other objects, features and advantages of the present application more comprehensible, embodiments of the present application are described below.
Drawings
The advantages and benefits described herein, as well as other advantages and benefits, will be apparent to those of ordinary skill in the art upon reading the following detailed description of the exemplary embodiments. The drawings are only for purposes of illustrating exemplary embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like elements throughout. In the drawings:
FIG. 1 is a schematic flow chart diagram of a model translation method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of converting a neural network layer of a model to be translated into a target neural network layer supported by a target processor according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a model translation apparatus according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a model translation apparatus according to another embodiment of the present application.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In this application, it is to be understood that terms such as "including" or "having" are intended to indicate the presence of the disclosed features, numbers, steps, acts, components, parts, or combinations thereof, and are not intended to preclude the presence or addition of one or more other features, numbers, steps, acts, components, parts, or combinations thereof.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
The embodiment of the invention provides a model translation method; the inventive concept of the method is introduced first.
After an engineer creates a model in a given format on a neural network framework interface such as the common TensorFlow, Caffe or Darknet framework, the model can normally run only on a dedicated processor that supports that format and cannot be widely deployed on other processors, such as the K210 processor. Based on this, an embodiment of the present invention provides a model translation method, which may specifically include: obtaining a to-be-translated model generated under a common neural network framework, where the to-be-translated model is a neural network model including a plurality of neural network layers; converting each neural network layer of the to-be-translated model into a corresponding neural network layer supported by a target processor to obtain a target model; training the target model according to a training data set to determine the parameters of each neural network layer in the target model; and generating output code from each neural network layer of the target model and its corresponding parameters. In this way, a neural network model generated under a common framework can be straightforwardly translated into a target model supported by the target processor, avoiding the incompatibility problem.
Those skilled in the art will appreciate that the described application scenario is only one example in which an embodiment of the present invention may be implemented. The scope of applicability of the embodiments of the present invention is not limited in any way. Having described the general principles of the invention, various non-limiting embodiments of the invention are described in detail below.
Fig. 1 is a schematic flowchart of a model translation method 100 according to an embodiment of the present application, for translating a model that a target processor does not support into a target model that the target processor can actually execute. From a device perspective, the execution subject may be one or more electronic devices, more specifically a processing module of a neural network device; from a program perspective, the execution subject may accordingly be a program loaded on these electronic devices.
As shown in fig. 1, the method 100 may include:
101, obtaining a model to be translated and a training data set;
wherein, the model to be translated is a neural network-based model which comprises at least one neural network layer. The neural network layer of the model may also be considered a function, for example. The model to be translated may be a neural network model created by the commonly used TensorFlow framework, Caffe framework, Darknet framework, or other framework. The training data set includes a large amount of sample data, such as image data or speech segments, for model training.
In one possible embodiment, the model to be translated is in pb format, h5 format, or darknet format. It is understood that the TensorFlow framework creates a neural network model in pb format, the Caffe framework creates a neural network model in h5 format, and the Darknet framework creates a neural network model in Darknet format.
In a possible implementation manner, the obtaining of the model to be translated in step 101 may further include: providing a programming interface and controlling the programming interface to disable functions which are not supported by the target processor; program codes of the model to be translated are received based on the programming interface, and the model to be translated is determined.
It should be understood that, to ensure the success rate of the conversion, a programming interface for inputting the program code of the model to be translated may be provided to the programmer, in which functions not supported by the target processor are disabled. In particular, the programming interface may provide a plurality of Application Program Interfaces (APIs) for the programmer to select from, where every selectable API corresponds to a function that can be translated into a function supported by the target processor.
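As a sketch of how such a restricted programming interface might work (all names here are hypothetical and not taken from the patent): the interface exposes only layer constructors whose functions can be translated for the target processor, and rejects anything else at model-definition time.

```python
# Hypothetical sketch of a programming interface that disables
# functions not supported by the target processor.
SUPPORTED_LAYERS = {"conv2d", "relu", "max_pool2d", "dense"}  # assumed set

class ModelBuilder:
    """Collects layer definitions; rejects unsupported layer types."""

    def __init__(self):
        self.layers = []

    def add(self, layer_type, **params):
        if layer_type not in SUPPORTED_LAYERS:
            raise ValueError(
                f"layer '{layer_type}' is not supported by the target processor")
        self.layers.append((layer_type, params))
        return self

builder = ModelBuilder()
builder.add("conv2d", filters=16).add("relu")
# builder.add("lstm")  # would raise ValueError: not translatable
```

Rejecting unsupported functions at the interface, rather than during conversion, guarantees that every model the programmer can express is translatable.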
102, converting a neural network layer of a model to be translated into a target neural network layer supported by a target processor to obtain a target model;
the layers of the neural network model, i.e., the neural network layers, may be thought of as functions that receive input data and perform specific function calculations and output data after the calculations are completed. Thus, converting the neural network layer of the model to be translated into a target neural network layer supported by the target processor may also be understood as an alternative to an equivalent function.
In a possible embodiment, in order to keep the function of the model consistent before and after translation, the step 102 may further include: determining a layer mapping relation between a neural network layer in the model to be translated and a target neural network layer supported by a target processor based on an equivalent relation between a function of the neural network layer in the model to be translated and a function of the target neural network layer supported by the target processor; and equivalently converting the neural network layer in the model to be translated into a target neural network layer supported by the target processor based on the layer mapping relation.
In one possible embodiment, determining a layer mapping relationship between a neural network layer in the model to be translated and a target neural network layer supported by a target processor further includes: determining a layer mapping relation between a designated neural network layer in the model to be translated and a target neural network layer supported by a target processor based on the equivalent relation; and/or determining a layer mapping relation between a specified neural network layer in the model to be translated and a target neural network layer combination supported by the target processor based on the equivalent relation; and/or determining a layer mapping relation between a specified neural network layer combination in the model to be translated and a target neural network layer supported by the target processor based on the equivalent relation; and/or determining a layer mapping relation between the designated neural network layer combination in the model to be translated and the target neural network layer combination supported by the target processor based on the equivalent relation.
For example, a function library supported by the target processor is obtained, containing a plurality of functions that the target processor can execute, such as functions a', b', c' and d' (corresponding to target neural network layers a, b, c and d, respectively). Referring to fig. 2, an example of converting a model to be translated into a target model is shown: data to be processed is input into the model to be translated and processed sequentially by neural network layers A, B, C and D (corresponding to functions A', B', C' and D', respectively); that is, from a computational perspective, the data is computed in turn by function A', function B', function C' and function D'. Each neural network layer in the model to be translated can then be equivalently converted, where "equivalent" means that the data processing logic before and after the conversion remains consistent. For example, if function A' and function a' perform substantially the same computation on the data to be processed, they may be considered equivalent, so neural network layer A in the model to be translated can be equivalently converted into target neural network layer a in the target model. Likewise, if the combination of functions B' and C' and the single function b' perform substantially the same computation on the output of layer A (or of target layer a), the combination may be considered equivalent to b', so the neural network layers B and C in the model to be translated can be equivalently converted into target neural network layer b in the target model. In fig. 2, the neural network layers B and C serve as an example of a designated neural network layer combination; it should be understood, however, that this embodiment does not limit the order or the number of neural network layers in a designated combination: two or more layers in series or in parallel, or several layers in a mixed series-parallel arrangement, may all serve as a designated combination, as long as this matches the actual model. Conversely, if function D' and the combination of functions c' and d' perform substantially the same computation on their input data, they may be considered equivalent, so neural network layer D in the model to be translated can be equivalently converted into target neural network layers c and d in the target model. Fig. 2 gives the combination of layers c and d as an example of a target neural network layer combination; as with designated combinations, this embodiment does not limit the order or number of neural network layers in a target neural network layer combination.
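The kinds of mapping illustrated in fig. 2 (one layer to one layer, a layer combination to one layer, one layer to a layer combination) can be sketched as a simple rule-driven rewrite. The rule table and layer names below are illustrative only, mirroring the A to a, (B, C) to b, and D to (c, d) example:

```python
# Sketch of equivalence-based layer mapping, following the Fig. 2 example:
# A -> a, (B, C) -> b, D -> (c, d).  Rules and names are illustrative.
LAYER_MAP = [
    (("A",), ("a",)),      # one source layer -> one target layer
    (("B", "C"), ("b",)),  # source layer combination -> one target layer
    (("D",), ("c", "d")),  # one source layer -> target layer combination
]

def translate(layers):
    """Greedily rewrite a source layer sequence into target layers."""
    out, i = [], 0
    while i < len(layers):
        for src, dst in LAYER_MAP:
            if tuple(layers[i:i + len(src)]) == src:
                out.extend(dst)
                i += len(src)
                break
        else:
            raise ValueError(f"no mapping for layer {layers[i]!r}")
    return out

print(translate(["A", "B", "C", "D"]))  # -> ['a', 'b', 'c', 'd']
```

A real implementation would match on layer types and parameters rather than names, and would need a policy for overlapping rules; the greedy first-match scan here is only the simplest choice.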
103, training the target model according to the training data set to determine parameters of each target neural network layer in the target model;
in a possible implementation, step 103 may further include: inputting sample data in the training data set into the target model to determine the numerical range of the input value of at least one target neural network layer of the target model; and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
The activation outputs of each target neural network layer of the target model are floating-point values, which occupy a large amount of storage and are inefficient to compute. To save memory and improve operating efficiency, this embodiment may quantize the activation output data of at least one target neural network layer of the target model (which is equivalent to quantizing the input values of at least one target neural network layer). Concretely, the quantization may map the activation output data from floating-point data (for example, 32-bit floating-point data, hereinafter FP32) to lower-bit fixed-point data (for example, 8-bit fixed-point data, hereinafter INT8), reducing the computing resources used. Referring to fig. 2, for example, after sample data in the training data set is input into the target model, the maximum and minimum of the activation outputs of target neural network layer a may be taken as the numerical range of the input values of target neural network layer b. A linear or nonlinear mapping from floating-point data (such as FP32 data) to fixed-point data (such as INT8 data) is then constructed from this numerical range, yielding the quantization parameter of target neural network layer b of the target model.
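As a hedged sketch of this step, the following uses an affine (scale and zero-point) FP32-to-INT8 mapping; this is one common formulation of a linear mapping, and the patent does not commit to a particular formula:

```python
# Sketch: derive INT8 quantization parameters from an observed value range,
# using an affine (scale / zero-point) mapping.  This is one common scheme;
# the patent does not fix a particular formula.
def calibrate(values, qmin=-128, qmax=127):
    """Fit scale and zero-point to the min/max of observed activations."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))  # clamp to the INT8 range

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

# Calibrate on activations observed while running sample data.
scale, zp = calibrate([0.0, 0.5, 1.7, 2.55])
```

With the range [0.0, 2.55], this maps 0.0 to -128 and 2.55 to 127, so the full INT8 range is used for the observed activations.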
It should be noted that, for the first layer of the target model, its input value can be quantized directly according to its quantization parameter. For an intermediate or final layer, the quantization parameter can only be determined after the numerical range of the output values of the preceding layer has been determined. In other words, when the target model is finally run, the input value of a given layer is not the original input value but the quantized input value; alternatively, the input value of every layer may be a quantized input value.
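This front-to-back dependence can be sketched as follows (toy layer functions, illustrative only): each layer's quantization range is the observed range of its inputs, i.e. of the previous layer's outputs, so the ranges are collected in order during one forward pass over sample data:

```python
# Sketch: quantization ranges are fixed layer by layer, front to back,
# because each layer's input range is the previous layer's output range.
def observe_ranges(layer_fns, samples):
    """Run samples through the float model, recording each layer's input range."""
    ranges = []
    xs = list(samples)
    for fn in layer_fns:
        ranges.append((min(xs), max(xs)))  # range of this layer's inputs
        xs = [fn(x) for x in xs]           # becomes the next layer's inputs
    return ranges

layers = [lambda x: 2 * x, lambda x: x + 1]  # toy two-layer model
print(observe_ranges(layers, [0.0, 1.0, 3.0]))
```

Each recorded range would then be fed to a calibration routine to produce that layer's quantization parameter.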
Optionally, the range of values of the input values of at least one target neural network layer in the target model may be determined according to the training data set, or may be determined according to another data set.
In another possible implementation, step 103 may further include: acquiring an object to be processed of a target model; inputting an object to be processed into a target model to determine a numerical range of input values of at least one target neural network layer of the target model; and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
Different from the above embodiment, in this embodiment a data object that the target model will actually process in subsequent operation is used as the object to be processed and input into the target model to determine the numerical range of the input values of at least one target neural network layer. For example, if the target model will later be used for face recognition in an access-control system, face images captured by the access-control camera can be input into the target model as objects to be processed, yielding a more accurate numerical range. Similarly, if the target model will later be used for speech recognition in a smart speaker, real speech can be input into the target model as the object to be processed to obtain a more accurate numerical range.
In one possible embodiment, after step 103, output code may also be generated from each target neural network layer of the target model and its corresponding parameters. This output code can be combined with a programmer's other program code to obtain a program that can run on a general-purpose processor, for example the K210 processor.
In one possible embodiment, the output code is in a C language format or a bin file format. In this manner, the neural network may be run on a general purpose processor.
Based on the same technical concept, the embodiment of the present application further provides a model translation apparatus, which is used for executing the model translation method provided by any of the above embodiments. Fig. 3 is a schematic structural diagram of a model translation apparatus according to an embodiment of the present application.
As shown in fig. 3, the model translation apparatus includes: a determining module 301, configured to obtain a model to be translated and a training data set; the model to be translated comprises at least one neural network layer; the layer conversion module 302 is configured to convert a neural network layer of a model to be translated into a target neural network layer supported by a target processor, so as to obtain a target model; the training module 303 is configured to train the target model according to the training data set to determine parameters of each target neural network layer in the target model.
In one possible implementation, the determining module 301 is configured to: receiving the program code of the model to be translated based on the programming interface, and determining the model to be translated; wherein the programming interface disables functions that are not supported by the target processor.
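The restricted programming interface described above can be sketched as a builder that rejects, at model-definition time, any function the target processor does not support, so that an unsupported layer fails when the model is written rather than when it is deployed. The supported-op set and the method names here are assumptions, not the actual interface.

```python
# Hypothetical set of functions supported by the target processor.
SUPPORTED_OPS = {"conv2d", "relu", "max_pool", "dense"}


class ModelBuilder:
    """Sketch of a programming interface in which functions not supported
    by the target processor are disabled: attempting to add one raises
    immediately during model definition."""

    def __init__(self):
        self.layers = []

    def add(self, op, **params):
        if op not in SUPPORTED_OPS:
            raise ValueError(f"'{op}' is not supported by the target processor")
        self.layers.append((op, params))
        return self  # allow chained calls
```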
In one possible implementation, the layer conversion module 302 is configured to: determining a layer mapping relation between a neural network layer in the model to be translated and a target neural network layer supported by a target processor based on an equivalent relation between a function of the neural network layer in the model to be translated and a function of the target neural network layer supported by the target processor; and equivalently converting the neural network layer in the model to be translated into a target neural network layer supported by the target processor based on the layer mapping relation.
In one possible implementation, the layer conversion module 302 is configured to: determining a layer mapping relation between a designated neural network layer in the model to be translated and a target neural network layer supported by a target processor based on the equivalent relation; and/or determining a layer mapping relation between a specified neural network layer in the model to be translated and a target neural network layer combination supported by the target processor based on the equivalent relation; and/or determining a layer mapping relation between a specified neural network layer combination in the model to be translated and a target neural network layer supported by the target processor based on the equivalent relation; and/or determining a layer mapping relation between the designated neural network layer combination in the model to be translated and the target neural network layer combination supported by the target processor based on the equivalent relation.
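The four kinds of layer mapping relation above (layer-to-layer, layer-to-combination, combination-to-layer, and combination-to-combination) can be sketched as a lookup table keyed by source-layer combinations, consumed greedily during conversion. The table contents and the greedy longest-match strategy are illustrative assumptions, not the patented procedure.

```python
# Hypothetical layer-mapping table. Keys are tuples of source-layer types,
# values are the equivalent target layer(s); the entries cover all four
# mapping cases named in the text.
LAYER_MAP = {
    ("conv2d",): ("target_conv2d",),                      # layer -> layer
    ("dense",): ("matmul", "bias_add"),                   # layer -> combination
    ("batch_norm", "relu"): ("fused_bn_relu",),           # combination -> layer
    ("pad", "conv2d"): ("target_pad", "target_conv2d"),   # combination -> combination
}


def translate(layers, layer_map=LAYER_MAP, max_group=2):
    """Greedily rewrite source layers into equivalent target layers,
    preferring the longest matching source combination."""
    out, i = [], 0
    while i < len(layers):
        for k in range(min(max_group, len(layers) - i), 0, -1):
            key = tuple(layers[i:i + k])
            if key in layer_map:
                out.extend(layer_map[key])
                i += k
                break
        else:
            raise ValueError(f"no target equivalent for layer '{layers[i]}'")
    return out
```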
In one possible implementation, the training module 303 is configured to: inputting sample data in the training data set into the target model to determine the numerical range of the input value of at least one target neural network layer of the target model; and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
In one possible implementation, the training module 303 is further configured to: acquiring an object to be processed of a target model; inputting an object to be processed into a target model to determine a numerical range of input values of at least one target neural network layer of the target model; and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
In one possible embodiment, the apparatus further comprises: and the code generation module is used for generating output codes according to each target neural network layer of the target model and the corresponding parameters.
In one possible embodiment, the model to be translated is in pb format, h5 format, or darknet format.
In one possible embodiment, the output code is in a C language format or a bin file format.
It should be noted that the apparatus in the embodiment of the present application may implement each process of the foregoing method embodiment, and achieve the same effect and function, which are not described herein again.
Fig. 4 is a model translation apparatus according to an embodiment of the present application, configured to perform the model translation method shown in fig. 1, and the apparatus includes: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the model translation method shown in the above embodiments.
According to some embodiments of the present application, a computer-readable storage medium stores a program that, when executed by a multi-core processor, causes the multi-core processor to perform the model translation method shown in the above-described embodiments.
The embodiments in the present application are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus, device, and computer-readable storage medium embodiments, the description is simplified because they are substantially similar to the method embodiments, and reference may be made to some descriptions of the method embodiments for their relevance.
The apparatus, device, and computer-readable storage medium provided in the embodiments of the present application correspond one-to-one to the method; therefore, the apparatus, device, and computer-readable storage medium also have beneficial technical effects similar to those of the corresponding method.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Moreover, while the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be broken down into multiple steps.
While the spirit and principles of the invention have been described with reference to several particular embodiments, it is to be understood that the invention is not limited to the disclosed embodiments; nor is the division into aspects limiting, as that division is for convenience of description only and does not imply that features in those aspects cannot be combined to advantage. The invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (20)

1. A method of model translation, comprising:
obtaining a model to be translated and a training data set, wherein the model to be translated comprises at least one neural network layer;
converting the neural network layer of the model to be translated into a target neural network layer supported by a target processor to obtain a target model;
and training the target model according to the training data set to determine parameters of each target neural network layer in the target model.
2. The model translation method according to claim 1, wherein the obtaining of the model to be translated comprises:
receiving the program code of the model to be translated based on the programming interface, and determining the model to be translated;
wherein the programming interface disables functions that are not supported by the target processor.
3. The model translation method according to claim 1, wherein converting the neural network layer of the model to be translated into a target neural network layer supported by a target processor comprises:
determining a layer mapping relation between a neural network layer in the model to be translated and a target neural network layer supported by the target processor based on an equivalent relation between a function of the neural network layer in the model to be translated and a function of the target neural network layer supported by the target processor;
and equivalently converting the neural network layer in the model to be translated into a target neural network layer supported by the target processor based on the layer mapping relation.
4. The model translation method according to claim 3, wherein determining the layer mapping relation between the neural network layer in the model to be translated and the target neural network layer supported by the target processor further comprises:
determining a layer mapping relation between a designated neural network layer in the model to be translated and a target neural network layer supported by the target processor based on the equivalent relation; and/or,
determining a layer mapping relation between a designated neural network layer in the model to be translated and a target neural network layer combination supported by the target processor based on the equivalent relation; and/or,
determining a layer mapping relation between a designated neural network layer combination in the model to be translated and a target neural network layer supported by the target processor based on the equivalent relation; and/or,
determining a layer mapping relationship between a specified neural network layer combination in the model to be translated and a target neural network layer combination supported by the target processor based on the equivalence relationship.
5. The method of model translation of claim 1, the method further comprising:
inputting sample data in the training data set into the target model to determine a numerical range of input values of at least one target neural network layer of the target model;
and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
6. The method of model translation of claim 1, the method further comprising:
acquiring an object to be processed of the target model;
inputting the object to be processed into the target model to determine a numerical range of input values of at least one target neural network layer of the target model;
and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
7. The method of model translation according to any of claims 1-6, wherein after determining parameters of each target neural network layer in the target model, the method further comprises:
and generating an output code according to each target neural network layer of the target model and the corresponding parameters.
8. The model translation method according to any one of claims 1 to 6, wherein the model to be translated is in pb format, h5 format, or darknet format.
9. The model translation method according to any one of claims 1 to 6, wherein the output code is in a C language format or a bin file format.
10. A model translation apparatus, comprising:
a determining module, configured to acquire a model to be translated and a training data set, wherein the model to be translated comprises at least one neural network layer;
the layer conversion module is used for converting the neural network layer of the model to be translated into a target neural network layer supported by a target processor to obtain a target model;
and the training module is used for training the target model according to the training data set so as to determine the parameters of each target neural network layer in the target model.
11. The model translation device of claim 10, wherein the determination module is configured to:
receiving the program code of the model to be translated based on the programming interface, and determining the model to be translated;
wherein the programming interface disables functions that are not supported by the target processor.
12. The model translation device of claim 10, wherein the layer conversion module is configured to:
determining a layer mapping relation between a neural network layer in the model to be translated and a target neural network layer supported by the target processor based on an equivalent relation between a function of the neural network layer in the model to be translated and a function of the target neural network layer supported by the target processor;
and equivalently converting the neural network layer in the model to be translated into a target neural network layer supported by the target processor based on the layer mapping relation.
13. The model translation device of claim 12, wherein the layer conversion module is configured to:
determining a layer mapping relation between a designated neural network layer in the model to be translated and a target neural network layer supported by the target processor based on the equivalent relation; and/or,
determining a layer mapping relation between a designated neural network layer in the model to be translated and a target neural network layer combination supported by the target processor based on the equivalent relation; and/or,
determining a layer mapping relation between a designated neural network layer combination in the model to be translated and a target neural network layer supported by the target processor based on the equivalent relation; and/or,
determining a layer mapping relationship between a specified neural network layer combination in the model to be translated and a target neural network layer combination supported by the target processor based on the equivalence relationship.
14. The model translation device of claim 10, wherein the training module is configured to:
inputting sample data in the training data set into the target model to determine a numerical range of input values of at least one target neural network layer of the target model;
and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
15. The model translation device of claim 10, wherein the training module is further configured to:
acquiring an object to be processed of the target model;
inputting the object to be processed into the target model to determine a numerical range of input values of at least one target neural network layer of the target model;
and determining the quantization parameter of at least one target neural network layer in the target model according to the numerical range of the input value and a preset quantization rule.
16. The model translation apparatus according to any one of claims 10 to 15, wherein the apparatus further comprises:
and the code generation module is used for generating output codes according to each target neural network layer of the target model and corresponding parameters.
17. The model translation apparatus according to any one of claims 10 to 15, wherein the model to be translated is in pb format, h5 format, or darknet format.
18. The model translation apparatus according to any one of claims 10 to 15, wherein the output code is in a C language format or a bin file format.
19. A model translation apparatus, comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform: the model translation method of any one of claims 1-9.
20. A computer-readable storage medium characterized in that the computer-readable storage medium stores a program which, when executed by a processor, causes the processor to execute the model translation method according to any one of claims 1 to 9.
CN202110209225.8A 2021-02-24 2021-02-24 Model translation method, device and computer readable storage medium Pending CN112905189A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110209225.8A CN112905189A (en) 2021-02-24 2021-02-24 Model translation method, device and computer readable storage medium
PCT/CN2022/073043 WO2022179361A1 (en) 2021-02-24 2022-01-20 Model translation method and device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110209225.8A CN112905189A (en) 2021-02-24 2021-02-24 Model translation method, device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN112905189A true CN112905189A (en) 2021-06-04

Family

ID=76107141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110209225.8A Pending CN112905189A (en) 2021-02-24 2021-02-24 Model translation method, device and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN112905189A (en)
WO (1) WO2022179361A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022179361A1 (en) * 2021-02-24 2022-09-01 嘉楠明芯(北京)科技有限公司 Model translation method and device and computer readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685202A (en) * 2018-12-17 2019-04-26 腾讯科技(深圳)有限公司 Data processing method and device, storage medium and electronic device
CN110362837A (en) * 2019-07-23 2019-10-22 闽南师范大学 A kind of artificial intelligence translation integrated system
CN111859991A (en) * 2020-07-29 2020-10-30 中国平安财产保险股份有限公司 Language translation processing model training method and language translation processing method
CN112380883A (en) * 2020-12-04 2021-02-19 北京有竹居网络技术有限公司 Model training method, machine translation method, device, equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108932533A (en) * 2018-07-12 2018-12-04 北京木瓜移动科技股份有限公司 Identification model construction method and device, character identifying method and device
CN111291882A (en) * 2018-12-06 2020-06-16 北京百度网讯科技有限公司 Model conversion method, device, equipment and computer storage medium
US11163762B2 (en) * 2019-07-15 2021-11-02 International Business Machines Corporation Mapping document data to relational data
CN112905189A (en) * 2021-02-24 2021-06-04 嘉楠明芯(北京)科技有限公司 Model translation method, device and computer readable storage medium

Also Published As

Publication number Publication date
WO2022179361A1 (en) 2022-09-01

Similar Documents

Publication Publication Date Title
US20200342322A1 (en) Method and device for training data, storage medium, and electronic device
KR20210033874A (en) Conversion method, device, computer device and storage medium
JP2019512126A (en) Method and system for training a machine learning system
CN113379070A (en) Deep learning framework conversion method, system, storage medium and equipment
CN112016296B (en) Sentence vector generation method, sentence vector generation device, sentence vector generation equipment and sentence vector storage medium
CN111723550A (en) Statement rewriting method, device, electronic device, and computer storage medium
CN111008701A (en) Data quantization method and device based on neural network and computer readable storage medium
CN111401518A (en) Neural network quantization method and device and computer readable storage medium
CN112905189A (en) Model translation method, device and computer readable storage medium
US20240095078A1 (en) Method and apparatus for accelerating inference of neural network model, electronic device, and medium
CN113408704A (en) Data processing method, device, equipment and computer readable storage medium
CN117197268A (en) Image generation method, device and storage medium
CN116402165A (en) Operator detection method and device, storage medium and electronic equipment
CN115600666A (en) Self-learning method and device for power transmission and distribution line defect detection model
CN109614571B (en) Nonlinear programming problem processing method and device
CN115034351A (en) Data processing method, convolutional neural network training method and device and FPGA
CN111582464B (en) Neural network processing method, computer system and storage medium
CN113536736A (en) Sequence generation method and device based on BERT
CN116306287B (en) Fan prediction data determining method, system, application and readable storage medium
CN111104062B (en) Storage management method, device and storage medium
CN112749783A (en) Neural network model quantification method and device and computer readable storage medium
CN111796806B (en) Method and device for generating object, electronic equipment and readable storage medium
CN115618239B (en) Management method, system, terminal and medium for deep learning framework training
CN115440186A (en) Audio characteristic information generation method, device, equipment and storage medium
CN112447165A (en) Information processing method, model training method, model building method, electronic equipment and intelligent sound box

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination