CN113947207A - Management method, system and device applied to model conversion and electronic equipment - Google Patents


Info

Publication number
CN113947207A
CN113947207A (application CN202010682147.9A)
Authority
CN
China
Prior art keywords
model
deep learning
learning model
target
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010682147.9A
Other languages
Chinese (zh)
Inventor
周智强
叶挺群
王鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202010682147.9A priority Critical patent/CN113947207A/en
Publication of CN113947207A publication Critical patent/CN113947207A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/04 Inference or reasoning models


Abstract

The embodiments of the application provide a management method, system, and apparatus applied to model conversion, and an electronic device. In these embodiments, the target deep learning model embodies both the second deep learning model matched with the target AI device and the model description information describing that second model. The key information required when the target deep learning model is used for deep learning inference is therefore marked visibly in the model itself, and fragmented information such as the AI device information, deep learning inference framework, and model training framework corresponding to the target deep learning model no longer needs to be recorded separately. This realizes unified management of the model conversion process on a multi-AI-device platform (a platform mixing AI devices from different application scenarios). Further, because the model description information describing the second deep learning model is embodied in the target deep learning model, the target deep learning model can subsequently be upgraded based on that description information.

Description

Management method, system and device applied to model conversion and electronic equipment
Technical Field
The present disclosure relates to computer technologies, and in particular, to a management method, a management system, a management apparatus, and an electronic device for model conversion.
Background
AI devices can be used for deep learning training and deep learning inference. Deep learning training means feeding a set of training data (labeled data) into a neural network, adjusting the connection weights of each network layer according to the difference between the network's output and the preset expected output, and finally training an ideal neural network. Deep learning inference means performing data inference with the trained neural network, for example: identifying images, recognizing recorded speech, detecting diseases in blood, or recommending clothing that fits a person's style. As an example, an AI device may be a device that deploys an AI chip, and the AI chip may be used to implement the deep learning training and deep learning inference described above.
Compared with deep learning training, deep learning inference has a higher demand for performance. To fully exploit the performance of AI devices, each AI device manufacturer configures a corresponding deep learning inference framework for the AI devices it ships.
Currently, when a deep learning model is deployed on an AI device, a common problem is that the deep learning model does not match the AI device (for example, the model does not match the deep learning inference framework corresponding to the AI device, so inference cannot be performed based on that framework). Once the deep learning model does not match the AI device, the model cannot rely on the deep learning inference framework corresponding to the AI device to perform inference.
Disclosure of Invention
The application provides a management method, system, and apparatus applied to model conversion, and an electronic device, so that when a deep learning model does not match an AI device, the model is converted and managed such that the converted deep learning model matches the AI device.
The method provided by the application comprises the following steps:
a management method applied to model transformation, the method comprising:
obtaining model conversion information of a first deep learning model to be converted, wherein the model conversion information at least comprises: a first deep learning model, and AI device information of a target AI device; the target AI device is an AI device to be deployed with the first deep learning model, and the first deep learning model is not matched with the target AI device;
converting the first deep learning model to obtain a second deep learning model matched with the AI device information of the target AI device;
and generating a target deep learning model of the first deep learning model according to the second deep learning model and the model description information for describing the second deep learning model.
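The three steps above can be sketched as a pipeline. The class and function names below are illustrative assumptions, not defined by the application; the conversion, description, and packaging operations are passed in as stand-ins:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ModelConversionInfo:
    # Model conversion information as described above: at least the first
    # deep learning model and the AI device information of the target device.
    first_model: bytes
    ai_device_info: dict
    training_framework: str = ""

def manage_model_conversion(
    info: ModelConversionInfo,
    convert: Callable[[bytes, dict], bytes],
    describe: Callable[[bytes, ModelConversionInfo], dict],
    package: Callable[[bytes, dict], object],
):
    """Convert the first model into a second model matched to the target AI
    device, then generate the target model from the second model plus the
    model description information (hypothetical pipeline shape)."""
    second_model = convert(info.first_model, info.ai_device_info)
    description = describe(second_model, info)
    return package(second_model, description)
```

The concrete `convert`/`describe`/`package` behaviors correspond to steps 202 and 203 described later in the flow of FIG. 2.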
As one embodiment, the converting the first deep learning model includes:
and forwarding the model conversion information to the target AI device corresponding to the AI device information, so that the target AI device converts the first deep learning model through the configured model conversion tool.
As an embodiment, the generating of the target deep learning model of the first deep learning model from the second deep learning model and the model description information for describing the second deep learning model includes:
packaging the model description information according to a specified model management protocol to obtain model packaging information;
loading the model packaging information at a specified position of the second deep learning model;
and packaging the second deep learning model loaded with the model packaging information at the specified position to obtain the target deep learning model.
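One way to realize "loading the model packaging information at a specified position" is a length-prefixed header in front of the model body. The wire format below (magic bytes, big-endian length, JSON header) is purely an assumption for illustration; the application does not define one:

```python
import json
import struct

MAGIC = b"MDLV1"  # hypothetical marker; not specified by the application

def package_model(second_model: bytes, description: dict) -> bytes:
    """Encapsulate the description per a (hypothetical) management protocol
    and load it at a specified position: here, before the model body."""
    header = json.dumps(description).encode("utf-8")
    return MAGIC + struct.pack(">I", len(header)) + header + second_model

def unpack_model(blob: bytes):
    """Recover the description and the second model from a packaged blob."""
    assert blob[:5] == MAGIC, "not a packaged target model"
    (hlen,) = struct.unpack(">I", blob[5:9])
    description = json.loads(blob[9:9 + hlen].decode("utf-8"))
    return description, blob[9 + hlen:]
```

Because the header travels inside the target model, a deployment tool can read the description without any side-channel records, which is the unified-management property the embodiment emphasizes.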
As an embodiment, the encapsulating the model description information according to a specified model management protocol to obtain model encapsulation information includes:
encapsulating the model description information and the model conversion information according to a specified model management protocol to obtain the model encapsulation information; or,
encapsulating the model description information and the information in the model conversion information other than the first deep learning model according to a specified model management protocol to obtain the model encapsulation information.
As an embodiment, the model description information includes software version information, where the software version information includes at least one of the following:
the lowest version information of the inference software stack used to perform deep learning inference with the second deep learning model, and the type of the inference software stack used to perform deep learning inference with the second deep learning model.
As an embodiment, the model parameters comprise at least one of the following parameters:
a model weight data format corresponding to the second deep learning model;
a model type of the second deep learning model; the model types include at least: a quantization type for indicating that the second deep learning model is a quantization model;
and, when the model type of the second deep learning model is the quantization type, the quantization parameters corresponding to the second deep learning model.
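Collecting the fields enumerated above, the model description information might look like the following structure; every field name and value is an illustrative assumption, since the application fixes no concrete schema:

```python
# Hypothetical model description information combining the software version
# fields and the model parameters enumerated above.
model_description = {
    "software_version": {
        "min_inference_stack_version": "1.0.0",   # lowest usable stack version
        "inference_stack_type": "example-stack",  # stack type for inference
    },
    "model_params": {
        "weight_data_format": "fp16",             # model weight data format
        "model_type": "quantized",                # quantization type marker
        "quant_params": {"scale": 0.05, "zero_point": 128},
    },
}
```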
As an embodiment, the method further comprises:
and encrypting the target deep learning model according to a set encryption algorithm and storing the encrypted target deep learning model.
A management system for application to model transformation, the system comprising:
the system comprises a client, a network model management server and N AI devices; the N AI devices comprise at least one cloud AI device, and/or at least one terminal side AI device, and/or at least one network edge side AI device; n is greater than or equal to 1; the client and the N AI devices are respectively connected with the network model management server through a network;
the network model management server obtains model conversion information of a first deep learning model to be converted from a client, wherein the model conversion information at least comprises: a first deep learning model, and AI device information of a target AI device; the target AI device is the AI device to be deployed with the first deep learning model in the N AI devices, and the first deep learning model is not matched with the target AI device; and
and converting the first deep learning model to obtain a second deep learning model matched with the AI equipment information of the target AI equipment, and generating the target deep learning model of the first deep learning model according to the second deep learning model and the model description information for describing the second deep learning model.
A management apparatus applied to model conversion, the apparatus being applied to a network model management server, the apparatus comprising:
an obtaining unit, configured to obtain model conversion information of a first deep learning model to be converted, where the model conversion information at least includes: a first deep learning model, and AI device information of a target AI device; the target AI device is an AI device to be deployed with the first deep learning model, and the first deep learning model is not matched with the target AI device;
the first processing unit is used for converting the first deep learning model to obtain a second deep learning model matched with the target AI device;
and the second processing unit is used for generating a target deep learning model of the first deep learning model according to the second deep learning model and model description information for describing the second deep learning model.
An electronic device, comprising: a processor and a machine-readable storage medium;
the machine-readable storage medium stores machine-executable instructions executable by the processor;
the processor is configured to execute machine-executable instructions to implement the method steps disclosed above.
According to the technical solution above, in this embodiment the second deep learning model matched with the target AI device is embodied in the target deep learning model, so the first deep learning model is converted and managed when it does not match the target AI device, and the resulting target deep learning model matches the target AI device;
further, by embodying in the target deep learning model both the second deep learning model matched with the target AI device and the model description information describing it, the key information required for deep learning inference with the target model (specifically, the model description information) is marked visibly, and fragmented information such as the AI device information, deep learning inference framework, and model training framework corresponding to the target model need not be recorded separately, realizing unified management of the model conversion process on a multi-AI-device platform (a platform mixing AI devices from different application scenarios);
still further, because the model description information describing the second deep learning model is embodied in the target deep learning model, the target model can subsequently be upgraded based on that description information.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a schematic diagram of a model transformation provided in an embodiment of the present application;
FIG. 2 is a flow chart of a method provided by an embodiment of the present application;
FIG. 3 is a flowchart of an implementation of step 203 provided by an embodiment of the present application;
FIG. 4 is a block diagram of a system provided in an embodiment of the present application;
FIG. 5 is a block diagram of an apparatus according to an embodiment of the present disclosure;
FIG. 6 is a hardware structure diagram of the device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
In order to make the technical solutions provided in the embodiments of the present application better understood and make the above objects, features and advantages of the embodiments of the present application more comprehensible, the technical solutions in the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
As an embodiment, when the deep learning model to be deployed does not match the AI device, the deep learning model is converted so that the converted deep learning model matches the AI device. In an example, the deep learning model may be converted by a configured model conversion tool, and how the deep learning model is converted by the model conversion tool is described below by way of example, which is not described herein for the sake of brevity.
Take the following three deep learning models as an example: model 101a (its training framework is Caffe, so it is also called a Caffe model), model 102a (training framework TensorFlow, also called a TensorFlow model), and model 103a (training framework PyTorch, also called a PyTorch model). Suppose these three deep learning models need to be deployed on three AI devices: AI device 101b, AI device 102b, and AI device 103b. Then:
When model 101a does not match AI device 101b, taking as an example that the deep learning inference framework corresponding to model 101a does not match AI device 101b (specifically, AI chip_0 on AI device 101b), model 101a undergoes model conversion to obtain a model (denoted model 101c) that matches the deep learning inference framework corresponding to AI device 101b (specifically, AI chip_0 on AI device 101b). Model 101c is then deployed on AI device 101b (specifically, on chip_0 of AI device 101b).
When model 101a does not match AI device 102b, taking as an example that the deep learning inference framework corresponding to model 101a does not match AI device 102b (specifically, AI chip_1 on AI device 102b), model 101a undergoes model conversion to obtain a model (denoted model 102c) that matches the deep learning inference framework corresponding to AI device 102b (specifically, AI chip_1 on AI device 102b). Model 102c is then deployed on AI device 102b (specifically, on chip_1 of AI device 102b).
When model 101a does not match AI device 103b, taking as an example that the deep learning inference framework corresponding to model 101a does not match AI device 103b (specifically, AI chip_2 on AI device 103b), model 101a undergoes model conversion to obtain a model (denoted model 103c) that matches the deep learning inference framework corresponding to AI device 103b (specifically, AI chip_2 on AI device 103b). Model 103c is then deployed on AI device 103b (specifically, on chip_2 of AI device 103b).
Models 102a and 103a are deployed on AI device 101b, AI device 102b, and AI device 103b in a manner similar to model 101a, which is not repeated here.
As can be understood from the above description, by performing model conversion on the deep learning model to match the converted deep learning model with the AI devices, the matched deep learning model can be finally configured on each AI device.
In an example, the AI devices 101b to 103b shown in fig. 1 are not limited to AI devices in the same application scenario, and may be AI devices in different application scenarios, such as a cloud, a terminal side, and a network edge side.
As an embodiment, after a deep learning model is converted, for example after model 101a is converted into a model (denoted model 101c) matching the deep learning inference framework corresponding to AI device 101b (specifically, AI chip_0 on AI device 101b), the converted model still needs to be managed in a unified way so that the AI device ultimately deploys a matched deep learning model. This is described in the flow shown in FIG. 2 below.
In order to implement unified management on the converted deep learning model, the present application provides the process shown in fig. 2.
Referring to fig. 2, fig. 2 is a flowchart of a method provided by an embodiment of the present application. The process is applied to a network model management server. In one example, the network model management server herein may be a device newly added between the client and the AI device. In another example, the network model management server may also be integrated into an existing device, such as a client, an AI device, and the like, and the embodiment is not particularly limited.
As shown in fig. 2, the process may include the following steps:
step 201, obtaining model conversion information of a first deep learning model to be converted, where the model conversion information at least includes: a first deep learning model, and AI device information of a target AI device; the target AI device is an AI device to be deployed with a first deep learning model, and the first deep learning model is not matched with the target AI device.
It should be noted that, here, the first deep learning model is only named for convenience of description and is not meant to be limiting. In one example, the first deep learning model may be an initially trained deep learning model.
As an embodiment, when a user decides, according to actual needs, to deploy the first deep learning model on an AI device (denoted the target AI device) and finds that the first deep learning model does not match the target AI device, the user sends a first model conversion request to the network model management server through the client, the request carrying the model conversion information described above. The network model management server thereby obtains the model conversion information of the first deep learning model to be converted. Optionally, the first deep learning model not matching the target AI device may mean: the training framework corresponding to the first deep learning model does not match the training framework corresponding to the target AI device, or the inference software stack version information of the deep learning inference framework required to run the first deep learning model does not match the inference software stack version information corresponding to the target AI device, and so on.
Based on the above description of the first model conversion request, as an embodiment, in this step 201, obtaining the model conversion information of the first deep learning model to be converted may include:
receiving a first model conversion request from a client, wherein the first model conversion request carries the model conversion information; model transformation information is obtained from the first model transformation request.
As described in step 201, the model conversion information optionally includes at least the first deep learning model and the AI device information of the target AI device. As one embodiment, the AI device information of the target AI device may include device capability information such as a device identifier of the target AI device (e.g., the major model and minor model of the AI chip on the target AI device) and the set of computing instructions supported by the target AI device. Optionally, when the target AI device includes AI chips of multiple sub-models, the device capability information here may be the capability information (software and hardware information) supported by a designated AI chip on the target AI device. In one example, the designated AI chip may be the AI chip on the target AI device whose software and hardware information (i.e., chip performance) has the highest version capability level. The device capability information is set to the software and hardware information of the highest version capability level because such information has high redundancy in the computing instruction set and software stack, which effectively saves the software and hardware resources that must be deployed for model conversion.
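The choice of a designated chip with the highest version capability level, described above, can be sketched as a simple selection over per-chip capability records; the `ChipCapability` structure and its fields are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class ChipCapability:
    chip_model: str          # e.g. a minor model identifier
    capability_level: int    # illustrative ranking of software/hardware version level
    instruction_sets: tuple  # computing instruction sets supported by the chip

def select_reference_chip(chips):
    """For a multi-chip target AI device, pick the chip whose software/hardware
    capability level is highest, per the embodiment above."""
    return max(chips, key=lambda c: c.capability_level)
```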
Step 202, the first deep learning model is converted to obtain a second deep learning model matched with the AI device information of the target AI device.
As an embodiment, in step 202 the model conversion information may further include the training framework corresponding to the first deep learning model. In an example, this training framework may be Caffe, TensorFlow, PyTorch, etc.; this embodiment is not specifically limited.
Optionally, in this embodiment, the training frame corresponding to the first deep learning model may be obtained simultaneously with the first deep learning model, or may be obtained sequentially with the first deep learning model, and this embodiment is not limited in particular.
As an embodiment, the conversion of the first deep learning model in step 202 may be implemented by the network model management server, specifically: the network model management server determines a corresponding model conversion mechanism by combining the training framework in the model conversion information with the AI device information of the target AI device, and then converts the first deep learning model, through the previously acquired model conversion tool corresponding to the target AI device and based on that mechanism, to obtain a second deep learning model matched with the target AI device. Here, the model conversion tool corresponding to the target AI device is used, when a deep learning model to be deployed on the target AI device does not match that device, to convert the model into one that does match.
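Determining a conversion mechanism from the training framework plus the target device information could, for illustration, be a table lookup; the keys and mechanism names below are hypothetical, since the application leaves the mechanism itself abstract:

```python
# Hypothetical mapping from (training framework, target chip model) to a
# model conversion mechanism; real deployments would populate this from
# the model conversion tools acquired for each target AI device.
CONVERSION_MECHANISMS = {
    ("Caffe", "chip_0"): "caffe-to-chip0",
    ("TensorFlow", "chip_0"): "tf-to-chip0",
    ("PyTorch", "chip_1"): "pt-to-chip1",
}

def pick_mechanism(training_framework: str, device_info: dict) -> str:
    """Resolve a conversion mechanism from the model conversion information."""
    key = (training_framework, device_info["chip_model"])
    try:
        return CONVERSION_MECHANISMS[key]
    except KeyError:
        raise ValueError(f"no conversion mechanism for {key}")
```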
Optionally, in an embodiment, the network model management server converting the first deep learning model through the model conversion tool may include: when the training framework corresponding to the first deep learning model does not match a training framework supported by the target AI device, converting the first deep learning model through the previously acquired model conversion tool corresponding to the target AI device, based on the model conversion mechanism, so that the training framework corresponding to the converted model matches one supported by the target AI device (i.e., obtaining a second deep learning model matched with the target AI device).
Optionally, in another embodiment, the network model management server converting the first deep learning model through the model conversion tool may include:
when the first deep learning model does not match the deep learning inference framework supported by the target AI device (for example, the algorithm of the first deep learning model does not match that inference framework, so the model cannot be run for deep learning inference on it), converting the first deep learning model through the previously acquired model conversion tool corresponding to the target AI device, based on the model conversion mechanism, so that the converted model can be run for deep learning inference on the inference framework supported by the target AI device (i.e., obtaining a second deep learning model matched with the target AI device).
As another embodiment, to relieve the load on the network model management server, the conversion in step 202 may instead be implemented by the target AI device, specifically: the model conversion information is forwarded to the target AI device; the target AI device determines a corresponding model conversion mechanism according to the AI device information and the training framework in the model conversion information, and converts the first deep learning model through its configured model conversion tool based on that mechanism. The way the target AI device performs the conversion is as described above for the network model management server and is not repeated here.
Finally, via step 202, a second deep learning model matching the AI device information of the target AI device is obtained. Note that, to uniformly manage the model conversion process of a multi-AI-device platform (a platform mixing AI devices from different application scenarios), this second deep learning model is not yet the final target deep learning model; in this embodiment it needs further processing, as described in step 203.
And step 203, generating a target deep learning model of the first deep learning model according to the second deep learning model and the model description information for describing the second deep learning model.
Optionally, in step 203 the model description information includes at least: software version information used to run the second deep learning model for deep learning inference, and/or model parameters of the second deep learning model.
Optionally, as an embodiment, the software version information here includes at least one of: the lowest version information of the inference software stack used for deep learning inference with the second deep learning model, and the type of that inference software stack. In one example, to facilitate tracing the target deep learning model, the software version information may further include the software version information of the model conversion tool, etc.
Optionally, as an embodiment, the model parameters may be some key parameters for describing the second deep learning model, which will be described in the following by way of example, and are not described herein again.
As an embodiment, in step 203, there are many implementation manners of generating the target deep learning model of the first deep learning model according to the second deep learning model and the model description information for describing the second deep learning model, but in any implementation manner, the finally obtained target deep learning model embodies the second deep learning model and the model description information for describing the second deep learning model. The flow shown in fig. 3 below describes an implementation manner of step 203 by way of example, and details are not repeated here.
In this embodiment, the model description information describing the second deep learning model is embodied in the target deep learning model. The key information required for deep learning inference with the target model (specifically, the model description information) is thus marked visibly, fragmented information such as the AI device information, deep learning inference framework, and model training framework corresponding to the target model need not be recorded separately, and unified management of the model conversion process on a multi-AI-device platform (a platform mixing AI devices from different application scenarios) is realized.
Further, when the model description information for describing the second deep learning model is embodied in the target deep learning model, the target deep learning model may be subsequently upgraded based on the model description information, and the like.
Thus, the flow shown in fig. 2 is completed.
Through the process shown in fig. 2, the target deep learning model finally obtained in this embodiment matches the target AI device. In other words, when the first deep learning model does not match the target AI device, conversion management is performed on the first deep learning model so that the resulting target deep learning model matches the target AI device.
Further, in this embodiment, the target deep learning model embodies not only the second deep learning model matched with the target AI device but also the model description information for describing the second deep learning model. This explicitly marks the key information (specifically, the model description information described above) required for deep learning inference with the target deep learning model, so that fragmented information such as the AI device information, deep learning inference framework, and model training framework corresponding to the target deep learning model does not need to be recorded separately, thereby realizing unified management of the model conversion process of a multi-AI-device platform (a platform mixing AI devices for different application scenarios);
still further, when the model description information for describing the second deep learning model is embodied in the target deep learning model, the target deep learning model may be subsequently upgraded based on the model description information, and the like.
It should be noted that, in step 201, the model conversion information may further include: an optimization identifier indicating that optimization processing is to be performed on the first deep learning model. Optionally, the optimization identifier here may be an identifier indicating that the model is to be quantized and/or compressed. When the optimization identifier is obtained, the optimization processing corresponding to it may be executed while converting the first deep learning model in step 202, so as to improve the performance of the finally obtained second deep learning model. Taking a quantization identifier as an example, in step 202, quantization may additionally be performed while converting the first deep learning model, so as to reduce the storage space occupied by the finally obtained second deep learning model; the model type of the second deep learning model obtained in this case may be a quantization type, which indicates that the second deep learning model is a quantized model.
It should be noted that, in this embodiment, while converting the first deep learning model in step 202, the first deep learning model may further be optimized according to the AI device information of the target AI device, so as to improve the computing performance of the second deep learning model obtained after conversion. The optimization here may include: converting the topology of the first deep learning model into a structure that the target AI device can recognize, optimizing the computation graph of the first deep learning model, converting the data dimension format of the first deep learning model (the weight data format of the second deep learning model obtained after conversion being nchw, nhwc, nchw_vec_c, or the like), offline generation of computation instructions, and other optimization processing.
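Of the optimizations named above, the data dimension (layout) conversion is the most mechanical. The sketch below shows the standard NCHW-to-NHWC axis transpose for a device that prefers channels-last tensors; the patent only names the formats, so this is an illustrative conversion, not the patent's own implementation.

```python
import numpy as np

def nchw_to_nhwc(x: np.ndarray) -> np.ndarray:
    """Move the channel axis of a 4-D tensor from position 1 (NCHW)
    to the last position (NHWC)."""
    assert x.ndim == 4, "expected a 4-D tensor laid out as (N, C, H, W)"
    return np.transpose(x, (0, 2, 3, 1))

x = np.zeros((8, 3, 224, 224))       # N=8, C=3, H=W=224
y = nchw_to_nhwc(x)
assert y.shape == (8, 224, 224, 3)   # now (N, H, W, C)
```

A converter would apply the same permutation to the weight tensors once at conversion time, so no transpose cost is paid during inference on the device.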
Based on this, the model parameters of the second deep learning model in step 203 may include parameters corresponding to the optimization processing, such as: the model weight data format corresponding to the second deep learning model; the model type of the second deep learning model, which at least includes a quantization type indicating that the second deep learning model is a quantized model; and, when the model type of the second deep learning model is the quantization type, the quantization parameters corresponding to the second deep learning model. Optionally, the quantization parameters here may include: the quantization bit width (for example, 8-bit, 4-bit, 1-bit, etc.), the quantization algorithm, and so on; this embodiment is not particularly limited.
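To make the relationship between the listed quantization parameters (bit width, algorithm, scale) and the quantized weights concrete, here is a toy symmetric per-tensor quantizer. The patent does not fix a particular algorithm; symmetric quantization is chosen here only as a familiar example.

```python
# Toy symmetric per-tensor quantization: one scale maps float weights onto
# signed n-bit integers. Illustrative only; the patent leaves the
# quantization algorithm open.

def quantize_symmetric(weights, n_bits=8):
    """Return (quantized integer weights, scale) for signed n-bit range."""
    qmax = 2 ** (n_bits - 1) - 1            # e.g. 127 for 8 bits
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from integers and scale."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.02]
q, scale = quantize_symmetric(w, n_bits=8)
recon = dequantize(q, scale)
# Each weight lands within half a quantization step of its original value.
assert all(-127 <= v <= 127 for v in q)
assert all(abs(a - b) <= scale / 2 + 1e-12 for a, b in zip(w, recon))
```

The (`n_bits`, `scale`) pair is exactly the kind of information that must travel with the model: without it, the device cannot map the stored integers back to real-valued weights.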
An example of how to generate the target deep learning model of the first deep learning model from the second deep learning model and the model description information for describing the second deep learning model in step 203 is described below:
referring to fig. 3, fig. 3 is a flowchart of an implementation of step 203 provided in an embodiment of the present application. As shown in fig. 3, the process may include the following steps:
step 301, encapsulating the model description information according to a specified model management protocol to obtain model encapsulation information.
Here, the model management protocol may be specified according to actual requirements, and the embodiment is not particularly limited.
In order to facilitate tracing the target deep learning model, as an embodiment, step 301 may further encapsulate the model conversion information. Based on this, step 301 may include: encapsulating the model description information and the model conversion information according to a specified model management protocol to obtain the model encapsulation information. Of course, in order to save encapsulation resources, the first deep learning model itself may also be excluded from the model conversion information during encapsulation in step 301. That is, step 301 may include: encapsulating the model description information and the information in the model conversion information other than the first deep learning model according to a specified model management protocol to obtain the model encapsulation information.
Step 302, loading the model encapsulation information at a specified position of the second deep learning model.
In an example, the specified position here may be a model protocol header position of the second deep learning model, an end position of the second deep learning model, or the like, and the embodiment is not particularly limited.
As an extension, the model encapsulation information can be represented in JSON or XML.
Step 303, packaging the second deep learning model loaded with the model encapsulation information at the specified position to obtain the target deep learning model.
Finally, through the process shown in fig. 3, the generation of the target deep learning model of the first deep learning model according to the second deep learning model and the model description information for describing the second deep learning model is realized.
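The three steps above can be sketched as a simple byte-level packaging scheme. The "specified model management protocol" is not published in the text, so the layout below (a magic marker, a 4-byte length field, a JSON header at the model protocol header position, then the raw model bytes) is an assumed convention purely for illustration; JSON is one of the representations the text names.

```python
import json
import struct

MAGIC = b"MDLP"  # hypothetical protocol marker, not from the patent

def package_model(model_bytes: bytes, description: dict) -> bytes:
    """Steps 301-303: encapsulate the description as a JSON header and
    load it at the head of the converted model's bytes."""
    header = json.dumps(description).encode("utf-8")
    # layout: MAGIC | 4-byte big-endian header length | header | raw model
    return MAGIC + struct.pack(">I", len(header)) + header + model_bytes

def unpack_model(blob: bytes):
    """Recover (description, model bytes) from a packaged model."""
    assert blob[:4] == MAGIC, "not packaged with this protocol"
    (hlen,) = struct.unpack(">I", blob[4:8])
    description = json.loads(blob[8:8 + hlen].decode("utf-8"))
    return description, blob[8 + hlen:]

desc = {"min_stack_version": "2.4.0", "model_type": "quantized"}
blob = package_model(b"\x00\x01weights", desc)
out_desc, out_model = unpack_model(blob)
assert out_desc == desc
assert out_model == b"\x00\x01weights"
```

Because the header carries its own length, a loader can read the description without parsing the model body, which is what makes the key information "visually marked" rather than recorded in a separate file.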
In one example, after the target deep learning model is generated, the target deep learning model may be encrypted according to a set encryption algorithm and the encrypted target deep learning model may be stored in order to improve the security of the target deep learning model. Here, the set encryption algorithm may be set according to actual requirements, and may be, for example, symmetric encryption, asymmetric encryption, unidirectional encryption, MD5, and the like, and the embodiment is not particularly limited.
Then, when needed, the user can export the stored encrypted target deep learning model and decrypt it with a decryption tool. The model description information and the like may be checked based on the decrypted target deep learning model, and whether to deploy the target deep learning model, whether to upgrade it, and so on may be determined based on the checked information; this embodiment is not specifically limited.
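The deploy-or-upgrade decision described above could look like the helper below. The patent deliberately leaves the concrete checks open, so every field name and decision rule here is invented for illustration.

```python
# Hypothetical decision helper: after decryption, inspect the model
# description information and decide what to do with the model.
# All fields and rules are illustrative assumptions.

def decide(desc, device, deployed_desc=None):
    """Return 'reject', 'deploy', 'upgrade', or 'keep'."""
    if desc["stack_type"] != device["stack_type"]:
        return "reject"        # inference stack does not match this device
    if deployed_desc is None:
        return "deploy"        # device has no model yet
    if desc["model_version"] > deployed_desc["model_version"]:
        return "upgrade"       # newer than the deployed model
    return "keep"

assert decide({"stack_type": "npu", "model_version": 2},
              {"stack_type": "npu"}) == "deploy"
assert decide({"stack_type": "npu", "model_version": 2},
              {"stack_type": "npu"},
              {"model_version": 1}) == "upgrade"
```

Because the description information travels inside the model package, this check can be made before any deployment traffic is sent to the device.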
Thus, the description of the method provided in this embodiment is completed.
The following describes the system and apparatus provided in this embodiment:
referring to fig. 4, fig. 4 is a system structure diagram provided in the present embodiment. As shown in fig. 4, the system may include: client 401, network model management server 402, N AI devices 403, N being greater than or equal to 1.
In one example, the client 401 is user-facing: by operating the client 401, the user sends the model conversion information of the first deep learning model to be converted to the network model management server 402, so that the network model management server 402 obtains the model conversion information of the first deep learning model to be converted. The model conversion information is as described above and is not detailed here.
In one example, the network model management server 402 is connected between the client 401 and the AI devices 403. Optionally, the client 401 and the AI devices 403 may be connected to the network model management server 402 via a network.
In one example, the AI device 403 may be an AI device applied to different application scenarios, such as cloud, end-side, edge-side, and so on. Optionally, the AI device applied to the cloud may be denoted as a cloud AI device. The AI device applied to the terminal side may be referred to as a terminal side AI device. The AI device applied to the network edge may be denoted as a network edge AI device.
Network model management server 402 may perform the process shown in fig. 2, specifically: obtaining, from the client 401, model conversion information of a first deep learning model to be converted, where the model conversion information at least includes: a first deep learning model, and AI device information of a target AI device; the target AI device is the AI device, among the N AI devices, on which the first deep learning model is to be deployed, and the first deep learning model does not match the target AI device; and,
converting the first deep learning model to obtain a second deep learning model matched with the AI equipment information of the target AI equipment; and generating a target deep learning model of the first deep learning model according to the second deep learning model and the model description information for describing the second deep learning model.
Optionally, the model description information at least includes: software version information for operating the second deep learning model for deep learning inference, and/or model parameters of the second deep learning model.
In one example, the network model management server 402 converting the first deep learning model may include: and forwarding the model conversion information to the target AI device corresponding to the AI device information, so that the target AI device converts the first deep learning model through the configured model conversion tool.
In one example, the network model management server 402 generating the target deep learning model of the first deep learning model according to the second deep learning model and the model description information for describing the second deep learning model includes:
packaging the model description information according to a specified model management protocol to obtain model packaging information;
loading the model packaging information at a specified position of the second deep learning model;
and packaging the second deep learning model loaded with the model packaging information at the specified position to obtain the target deep learning model.
In one example, the network model management server 402 packages the model description information according to a specified model management protocol to obtain model package information, which includes:
encapsulating the model description information and the model conversion information according to a specified model management protocol to obtain model encapsulation information; or,
and encapsulating the model description information and the information in the model conversion information other than the first deep learning model according to a specified model management protocol to obtain the model encapsulation information.
In one example, the software version information includes at least one of the following information:
the minimum version of the inference software stack used to perform deep learning inference with the second deep learning model, and the type of the inference software stack used to perform deep learning inference with the second deep learning model.
In one example, the model parameters include at least one of the following parameters:
a model weight data format corresponding to the second deep learning model;
a model type of the second deep learning model; the model types include at least: a quantization type for indicating that the second deep learning model is a quantization model;
and when the model type of the second deep learning model is a quantization type, the second deep learning model corresponds to a quantization parameter.
In one example, network model management server 402 may further include:
and encrypting the target deep learning model according to a set encryption algorithm and storing the encrypted target deep learning model.
The system provided by the embodiment of the present application is described above. The following describes the apparatus provided in the embodiments of the present application:
referring to fig. 5, fig. 5 is a structural diagram of an apparatus provided in an embodiment of the present application. The device is applied to a network model management server and can comprise:
an obtaining unit, configured to obtain model conversion information of a first deep learning model to be converted, where the model conversion information at least includes: a first deep learning model, and AI device information of a target AI device; the target AI device is an AI device to be deployed with the first deep learning model, and the first deep learning model is not matched with the target AI device;
the first processing unit is used for converting the first deep learning model to obtain a second deep learning model matched with the target AI device;
and the second processing unit is used for generating a target deep learning model of the first deep learning model according to the second deep learning model and model description information for describing the second deep learning model.
Optionally, the model description information at least includes: and the software version information is used for operating the second deep learning model to carry out deep learning inference, and/or the model parameters of the second deep learning model.
As one embodiment, the first processing unit converting the first deep learning model includes:
and forwarding the model conversion information to the target AI device corresponding to the AI device information, so that the target AI device converts the first deep learning model through the configured model conversion tool.
As an embodiment, the second processing unit generating the target deep learning model of the first deep learning model from a second deep learning model and model description information for describing the second deep learning model includes:
packaging the model description information according to a specified model management protocol to obtain model packaging information;
loading the model packaging information at a specified position of the second deep learning model;
and packaging the second deep learning model loaded with the model packaging information at the specified position to obtain the target deep learning model.
Optionally, the encapsulating, by the second processing unit, the model description information according to a specified model management protocol to obtain model encapsulation information, including:
encapsulating the model description information and the model conversion information according to a specified model management protocol to obtain model encapsulation information; or,
and encapsulating the model description information and the information in the model conversion information other than the first deep learning model according to a specified model management protocol to obtain the model encapsulation information.
As an embodiment, the software version information includes at least one of the following information:
the minimum version of the inference software stack used to perform deep learning inference with the second deep learning model, and the type of the inference software stack used to perform deep learning inference with the second deep learning model.
As an embodiment, the model parameters comprise at least one of the following parameters:
a model weight data format corresponding to the second deep learning model;
a model type of the second deep learning model; the model types include at least: a quantization type for indicating that the second deep learning model is a quantization model;
and when the model type of the second deep learning model is a quantization type, the second deep learning model corresponds to a quantization parameter.
Optionally, the second processing unit may further encrypt the target deep learning model according to a set encryption algorithm and store the encrypted target deep learning model.
Thus, the description of the structure of the apparatus shown in fig. 5 is completed.
Correspondingly, the application also provides a hardware structure of the device shown in fig. 5. Referring to fig. 6, the hardware structure may include: a processor and a machine-readable storage medium having stored thereon machine-executable instructions executable by the processor; the processor is configured to execute machine-executable instructions to implement the methods disclosed in the above examples of the present application.
Based on the same application concept as the method, embodiments of the present application further provide a machine-readable storage medium, where several computer instructions are stored, and when the computer instructions are executed by a processor, the method disclosed in the above example of the present application can be implemented.
The machine-readable storage medium may be, for example, any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions or data. For example, the machine-readable storage medium may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), a similar storage medium, or a combination thereof.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A management method applied to model transformation, the method comprising:
obtaining model conversion information of a first deep learning model to be converted, wherein the model conversion information at least comprises: a first deep learning model, and AI device information of a target AI device; the target AI device is an AI device to be deployed with the first deep learning model, and the first deep learning model is not matched with the target AI device;
converting the first deep learning model to obtain a second deep learning model matched with the AI equipment information of the target AI equipment;
and generating a target deep learning model of the first deep learning model according to the second deep learning model and the model description information for describing the second deep learning model.
2. The method of claim 1, wherein the converting the first deep learning model comprises:
and forwarding the model conversion information to the target AI device corresponding to the AI device information, so that the target AI device converts the first deep learning model through the configured model conversion tool.
3. The method of claim 1, wherein generating the target deep learning model of the first deep learning model from a second deep learning model and model description information describing the second deep learning model comprises:
packaging the model description information according to a specified model management protocol to obtain model packaging information;
loading the model packaging information at a specified position of the second deep learning model;
and packaging the second deep learning model loaded with the model packaging information at the specified position to obtain the target deep learning model.
4. The method of claim 3, wherein encapsulating the model description information according to a specified model management protocol yields model encapsulation information, comprising:
encapsulating the model description information and the model conversion information according to a specified model management protocol to obtain model encapsulation information; or,
and encapsulating the model description information and the information in the model conversion information other than the first deep learning model according to a specified model management protocol to obtain the model encapsulation information.
5. The method of claim 1, wherein the software version information comprises at least one of:
the minimum version of the inference software stack used to perform deep learning inference with the second deep learning model, and the type of the inference software stack used to perform deep learning inference with the second deep learning model.
6. The method according to claim 1, characterized in that the model parameters comprise at least one of the following parameters:
a model weight data format corresponding to the second deep learning model;
a model type of the second deep learning model; the model types include at least: a quantization type for indicating that the second deep learning model is a quantization model;
and when the model type of the second deep learning model is a quantization type, the second deep learning model corresponds to a quantization parameter.
7. The method of any one of claims 1 to 6, further comprising:
and encrypting the target deep learning model according to a set encryption algorithm and storing the encrypted target deep learning model.
8. A management system for model transformation, the system comprising:
the system comprises a client, a network model management server and N AI devices; the N AI devices comprise at least one cloud AI device, and/or at least one terminal side AI device, and/or at least one network edge side AI device; n is greater than or equal to 1; the client and the N AI devices are respectively connected with the network model management server through a network;
the network model management server obtains model conversion information of a first deep learning model to be converted from a client, wherein the model conversion information at least comprises: a first deep learning model, and AI device information of a target AI device; the target AI device is the AI device to be deployed with the first deep learning model in the N AI devices, and the first deep learning model is not matched with the target AI device; and
and converting the first deep learning model to obtain a second deep learning model matched with the AI equipment information of the target AI equipment, and generating the target deep learning model of the first deep learning model according to the second deep learning model and the model description information for describing the second deep learning model.
9. A management device applied to model conversion is characterized in that the device is applied to a network model management server and comprises:
an obtaining unit, configured to obtain model conversion information of a first deep learning model to be converted, where the model conversion information at least includes: a first deep learning model, and AI device information of a target AI device; the target AI device is an AI device to be deployed with the first deep learning model, and the first deep learning model is not matched with the target AI device;
the first processing unit is used for converting the first deep learning model to obtain a second deep learning model matched with the target AI device;
and the second processing unit is used for generating a target deep learning model of the first deep learning model according to the second deep learning model and model description information for describing the second deep learning model.
10. An electronic device, comprising: a processor and a machine-readable storage medium;
the machine-readable storage medium stores machine-executable instructions executable by the processor;
the processor is configured to execute machine executable instructions to implement the method steps of any of claims 1-7.
CN202010682147.9A 2020-07-15 2020-07-15 Management method, system and device applied to model conversion and electronic equipment Pending CN113947207A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010682147.9A CN113947207A (en) 2020-07-15 2020-07-15 Management method, system and device applied to model conversion and electronic equipment

Publications (1)

Publication Number Publication Date
CN113947207A true CN113947207A (en) 2022-01-18

Family

ID=79326233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010682147.9A Pending CN113947207A (en) 2020-07-15 2020-07-15 Management method, system and device applied to model conversion and electronic equipment

Country Status (1)

Country Link
CN (1) CN113947207A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114611714A (en) * 2022-05-11 2022-06-10 成都数之联科技股份有限公司 Model processing method, device, system, electronic equipment and storage medium
CN114611714B (en) * 2022-05-11 2022-09-02 成都数之联科技股份有限公司 Model processing method, device, system, electronic equipment and storage medium
CN115878096A (en) * 2023-01-31 2023-03-31 北京面壁智能科技有限责任公司 Deep learning model unified application method, device, server and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination