CN114861860A - Deep learning model processing method and device and electronic equipment - Google Patents

Deep learning model processing method and device and electronic equipment

Info

Publication number
CN114861860A
CN114861860A
Authority
CN
China
Prior art keywords
node
deep learning
model
information
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110157121.7A
Other languages
Chinese (zh)
Inventor
魏可鑫 (Wei Kexin)
董鑫 (Dong Xin)
林志强 (Lin Zhiqiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202110157121.7A priority Critical patent/CN114861860A/en
Publication of CN114861860A publication Critical patent/CN114861860A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/30Creation or generation of source code
    • G06F8/34Graphical or visual programming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application provides a deep learning model processing method and apparatus, and an electronic device, applied to the technical field of artificial intelligence. They enable a user to visually create or modify a model through an intermediate format file, conveniently and accurately generate model training code, and achieve one-time development with adaptation to multiple deep learning frameworks. The method includes: in response to a first operation in which a user selects at least one first control from a displayed model editing interface, displaying at least one node and connecting lines between the at least one node on the model editing interface, where the at least one node corresponds to the at least one first control one to one, each node has node information, and the connecting lines of the at least one node indicate the connection relationship between the at least one node; and generating a first target file based on the node information of the at least one node and the connection relationship between the at least one node, where the first target file includes the structural information and graphical information of a first deep learning model.

Description

Deep learning model processing method and device and electronic equipment
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for processing a deep learning model, and an electronic device.
Background
With the wide application of deep learning models such as neural network models (NN models), requirements for intelligence in the processing of deep learning models keep rising; for example, there is growing demand for visualization of the deep learning model processing workflow.
At present, visualization approaches for deep learning models fall into two main categories. In the first, the visualization tool can only read and view a deep learning model and has no editing (i.e., creating or modifying) capability. The most representative such tool is Netron, which can directly read native deep learning model formats and display them visually; supported formats include ONNX, Keras, Core ML, Caffe2, Darknet, MXNet, ncnn, and TensorFlow Lite, but there is no editing function. In the second category, although the visualization tool supports visual arrangement of a model, so that a technician can create a deep learning model with little or no code, such tools are all provided as cloud services; that is, a communication connection must be established with one or more devices providing the cloud service, which places high demands on communication capability during model arrangement. Examples of tools in the second category are those provided by Microsoft, Amazon, and others for visually arranging deep learning models.
In addition, current mainstream deep learning frameworks, such as TensorFlow, PyTorch, MXNet, PaddlePaddle, and MindSpore, all store deep learning models in the protobuf format. Protobuf is a platform-independent, language-independent, and efficient exchange format designed for a small memory footprint and fast serialization and deserialization, and has nothing to do with model visualization. If a user needs to view and modify a deep learning model stored via protobuf, the model can only be inspected visually with a separate Netron tool, after which the model's code must be opened separately for modification, making the process of viewing and modifying the neural network model cumbersome.
Disclosure of Invention
The embodiments of the application provide a deep learning model processing method and apparatus, and an electronic device, which enable a user to visually create or modify a model through an intermediate format file, conveniently and accurately generate model training code, and achieve one-time development with adaptation to multiple deep learning frameworks.
In a first aspect, an embodiment of the present application provides a deep learning model processing method, applied to an electronic device, including: receiving a first operation in which a user selects at least one first control from a displayed model editing interface; in response to the first operation, displaying at least one node and connecting lines between the at least one node on the model editing interface, where the at least one node corresponds to the at least one first control one to one, each node has node information, and the connecting lines of the at least one node indicate the connection relationship between the at least one node; and generating a first target file based on the node information of the at least one node and the connection relationship between the at least one node, where the first target file includes the structural information and graphical information of the first deep learning model. It can be understood that the model editing interface is an interface of the neural network model (NN model) graphical arrangement tool (or application layout designer) described below. The at least one first control is one or more node (or node template) controls displayed in the node selection area 101a of the UI interface 11 described below. The at least one node is a node added by the user to the graphic editing area (or canvas) 101c of the UI interface 11. The first target file is the intermediate format file described below, sitting between the model graph and the model training code. Specifically, the above "displaying at least one node and connecting lines between the at least one node on the model editing interface in response to the first operation" can be implemented by steps 301-304 below.
The first operation may include not only the user's selection operation on a plurality of nodes in the node selection area 101a of the UI interface 11 in step 302, but also the opening operation in step 301, the user's setting operation on the plurality of nodes in the graphic editing area 101c of the UI interface 11 in step 303, and the user's control operation (i.e., the operation generating the connecting lines) on the plurality of nodes in the graphic editing area 101c of the UI interface 11 in step 304. Further, the above "generating the first target file based on the node information of the at least one node and the connection relationship between the at least one node" may be implemented by step 305 below. Because the intermediate format file is decoupled from any specific deep learning framework, it can accurately express both the structure of the model and the graphical information of the model, i.e., its user interface (UI) information.
In one possible implementation of the first aspect, each item of node information includes at least one of the following: functional parameters of the corresponding operator, graphic information, input/output limitation information, and code corresponding to at least one deep learning framework, where each deep learning framework is used to generate model building code in the code format corresponding to the first deep learning model. For example, the node information of one node may include: 1. the name of the node, i.e., the type of the node; 2. the shape of the figure, such as a rectangle or a circle; 3. attributes of the figure, such as its display color, display size (e.g., the height and width of the graphic frame), and display position (i.e., coordinates) in the canvas; 4. the type of the corresponding operator, e.g., whether it is a convolution operator such as "Conv2D" or an activation operator such as "ReLU"; 5. the parameters of the operator and their types; 6. input/output limitation information; and 7. code corresponding to different deep learning frameworks. Items 1-3 constitute the graphic information of the node, and items 4-6 constitute the functional parameters of the node. The input/output limits of the nodes are used to guarantee the structural integrity of the deep learning model built from the nodes. For example, the input/output limit of a node may indicate the set of allowed types of its input nodes and/or the set of allowed types of its output nodes. Since each node has node information, the first target file may be generated based on the node information of the at least one node.
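As a concrete illustration, the seven items of node information above can be sketched as a Python dictionary. All field names and values here are illustrative assumptions, not the patent's actual intermediate-format schema:

```python
# Hypothetical node-information record: graph info (items 1-3), functional
# parameters (items 4-6), and per-framework code templates (item 7).
conv_node = {
    "name": "Conv2D",                      # 1. node name / type
    "shape": "rectangle",                  # 2. shape of the figure
    "graphic": {                           # 3. display attributes in the canvas
        "color": "#4A90D9",
        "size": {"height": 40, "width": 120},
        "position": {"x": 100, "y": 60},
    },
    "operator": "Conv2D",                  # 4. type of the corresponding operator
    "params": {"in_channels": 3, "out_channels": 16, "kernel_size": 3},  # 5.
    "io_limits": {                         # 6. allowed input/output node types
        "inputs": ["DATA", "Relu", "MaxPool"],
        "outputs": ["Relu", "BiasAdd"],
    },
    "code_templates": {                    # 7. code per deep learning framework
        "mindspore": "nn.Conv2d({in_channels}, {out_channels}, {kernel_size})",
        "tensorflow": "tf.keras.layers.Conv2D({out_channels}, {kernel_size})",
    },
}

def graph_info(node):
    """The graphic information (items 1-3) of a node."""
    return {k: node[k] for k in ("name", "shape", "graphic")}

def functional_params(node):
    """The functional parameters (items 4-6) of a node."""
    return {k: node[k] for k in ("operator", "params", "io_limits")}
```

Splitting the record this way mirrors the text: the target file stores both halves, so the same node can be restored visually (items 1-3) and converted to code (items 4-7).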
In a possible implementation of the first aspect, the method further includes: generating, in response to a second operation of the user, first model creation code for the first target file based on a target deep learning framework, where the target deep learning framework is one of at least one deep learning framework, and each deep learning framework is used to generate model building code in the code format corresponding to the first deep learning model. For example, the second operation may be the user's operation on the function control 216 of the function selection area 101b in the UI interface 11 in step 306 below. It can be understood that the target deep learning framework may be a default deep learning framework in the application layout designer described below, or a deep learning framework selected by the user from the at least one deep learning framework.
In a possible implementation of the first aspect, the method further includes: responding to a third operation of the user, saving the first target file, and canceling the display of at least one node and a connecting line between the at least one node on the model editing interface; and responding to a fourth operation of the user, and redisplaying the at least one node and the connecting line between the at least one node on the model editing interface according to the first target file. For example, the fourth operation may be an operation, such as a click operation, by the user on the function control 216 in the function selection area 101b in the UI interface 11. And the intermediate format file can be restored into a visual graph of the first deep learning model, so that the user can conveniently view the deep learning model visually.
In a possible implementation of the first aspect, the method further includes: and in response to a fifth operation of the user, updating the at least one node and the connecting line of the at least one node on the model editing interface, and updating the first target file. And the visual graph of the first deep learning model can be recovered from the intermediate format file, so that the deep learning model can be modified visually by a user conveniently without manually writing codes by the user.
In a possible implementation of the first aspect, displaying the at least one node and the connecting lines between the at least one node on the model editing interface in response to the first operation includes: in response to the first operation, displaying the at least one node and the connecting lines between the at least one node on the model editing interface if a constraint condition is satisfied, and displaying target prompt information on the model editing interface if the constraint condition is not satisfied; where the constraint condition includes at least one of: a sub-constraint on the functional parameters of the node's corresponding operator, a sub-constraint on the node's graphic information, and a sub-constraint on the node's inputs and outputs; and the target prompt information prompts that the node information of the at least one node and/or the connection relationship between the at least one node contains an error. It can be understood that, unlike writing code directly, the model building code converted from an intermediate format file generated under these constraints has undergone strict constraint checking, which guarantees the legality and validity of the model code.
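A minimal sketch of how such a constraint check might run before the nodes and connecting lines are displayed. The three checks correspond to the three sub-constraints named above; the node fields (`params`, `width`, `allowed_inputs`, etc.) are illustrative assumptions:

```python
def check_constraints(nodes, edges):
    """Return a list of error messages; an empty list means every
    sub-constraint (operator parameters, graphic information, and
    input/output limits) is satisfied and the graph may be displayed."""
    errors = []
    by_name = {n["name"]: n for n in nodes}
    for n in nodes:
        # sub-constraint on the functional parameters of the operator
        if any(v <= 0 for v in n.get("params", {}).values()):
            errors.append(f"{n['name']}: operator parameters must be positive")
        # sub-constraint on the graphic information of the node
        if n["width"] <= 0 or n["height"] <= 0:
            errors.append(f"{n['name']}: invalid graphic size")
    for src, dst in edges:
        # sub-constraint on inputs/outputs: dst must accept src's node type
        if by_name[src]["type"] not in by_name[dst]["allowed_inputs"]:
            errors.append(f"{src} -> {dst}: connection not allowed")
    return errors

nodes = [
    {"name": "data",  "type": "DATA",   "params": {},
     "width": 120, "height": 40, "allowed_inputs": []},
    {"name": "conv1", "type": "Conv2D", "params": {"kernel_size": 3},
     "width": 120, "height": 40, "allowed_inputs": ["DATA", "Relu"]},
    {"name": "relu1", "type": "Relu",   "params": {},
     "width": 120, "height": 40, "allowed_inputs": ["Conv2D"]},
]
ok_errors  = check_constraints(nodes, [("data", "conv1"), ("conv1", "relu1")])
bad_errors = check_constraints(nodes, [("data", "relu1")])  # DATA cannot feed Relu
```

An empty error list corresponds to displaying the graph; a non-empty one corresponds to showing the target prompt information.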
In a possible implementation of the first aspect, the method further includes: adding, in response to a sixth operation of the user, at least one extension control in the model editing interface, where each extension control corresponds to one node, and each extension control triggers, in response to a user operation, the display of the corresponding node on the model editing interface. For example, the sixth operation may be the user's operation on the control 213 of the node selection area 101a in the interface 10 in step 308 below, and the at least one extension control may be the one or more extension meta-nodes in step 308 below. The embodiment of the application thus provides a node extension mechanism, which solves the extensibility problem of deep learning framework operators: when an operator is newly added, it is supported as long as the template file corresponding to the operator is added, achieving one-time development with support for multiple deep learning frameworks.
In a second aspect, an embodiment of the present application provides a processing apparatus of a deep learning model, including: a receiving module (for example, implemented by an input device such as a touch screen or a mouse in the electronic device) for receiving a first operation of selecting at least one first control from the displayed model editing interface by a user; a display module (e.g., implemented by a screen in the electronic device) configured to display, on the model editing interface in response to the first operation received by the receiving module, at least one node in one-to-one correspondence with the at least one first control and connecting lines between the at least one node, where each node has node information, and the connecting lines of the at least one node indicate the connection relationship between the at least one node; and a first generating module (e.g., implemented by a processor in the electronic device) configured to generate a first target file based on the node information of the at least one node and the connection relationship between the at least one node, where the first target file includes structural information and graphical information of the first deep learning model.
In one possible implementation of the second aspect, each item of node information includes at least one of the following: functional parameters of the corresponding operator, graphic information, input/output limitation information, and code corresponding to at least one deep learning framework, where each deep learning framework is used to generate model building code in the code format corresponding to the first deep learning model.
In a possible implementation of the second aspect, the apparatus further includes: a second generating module, configured to generate, in response to a second operation of the user, first model creation code for the first target file generated by the first generating module based on a target deep learning framework, where the target deep learning framework is one of at least one deep learning framework, and each deep learning framework is used to generate model building code in the code format corresponding to the first deep learning model.
In a possible implementation of the second aspect, the apparatus further includes: a saving module (e.g., implemented by a memory in an electronic device) for saving the first object file generated by the first generating module and canceling a display of a connection line between the at least one node and the at least one node on the model editing interface in response to a third operation by a user; and the display module is further used for responding to a fourth operation of the user and displaying the at least one node and the connecting line between the at least one node on the model editing interface again according to the first target file.
In a possible implementation of the second aspect, the apparatus further includes: an updating module (e.g., implemented by a processor in an electronic device) is configured to update the at least one node and the connecting line of the at least one node on the model editing interface and update the first target file in response to a fifth operation by the user.
In a possible implementation of the second aspect, the first generating module is specifically configured to, in response to the first operation, display at least one node and a connection line between the at least one node on the model editing interface if a constraint condition is satisfied; under the condition that the constraint condition is not met, displaying target prompt information on the model editing interface; wherein the constraints comprise at least one of: the sub-constraint condition of the functional parameter of the corresponding operator of the node, the sub-constraint condition of the graphic information of the node, and the sub-constraint condition of the input and the output of the node; the target prompt message is used for prompting node information of the at least one node and/or errors of connection relations among the at least one node.
In a possible implementation of the second aspect, the apparatus further includes: and an adding module (for example, implemented by a processor in the electronic device) configured to add, in response to a sixth operation by the user, at least one extension control in the model editing interface, where each extension control corresponds to one node, and each extension control is configured to trigger, in response to the operation by the user, the corresponding node to be displayed on the model editing interface.
In a third aspect, an embodiment of the present application provides a readable medium, where instructions are stored on the readable medium, and when executed on an electronic device, the instructions cause the electronic device to perform the processing method of the deep learning model according to the first aspect.
In a fourth aspect, an embodiment of the present application provides an electronic device, including: a memory for storing instructions to be executed by one or more processors of an electronic device, and a processor, which is one of the processors of the electronic device, for executing the instructions stored in the memory to implement the processing method of the deep learning model according to the first aspect.
Drawings
FIG. 1 illustrates a block diagram of a system architecture to which a method of processing a deep learning model is applied, according to some embodiments of the present application;
FIG. 2 illustrates a UI interface diagram to which a method of processing a deep learning model is applied, according to some embodiments of the present application;
FIG. 3 illustrates a flow diagram of a method of processing a deep learning model, according to some embodiments of the present application;
FIG. 4 illustrates a flow diagram of a method of processing a deep learning model, according to some embodiments of the present application;
FIG. 5 illustrates a UI interface diagram to which a method of processing a deep learning model is applied, according to some embodiments of the present application;
FIG. 6 illustrates a flow diagram of a method of processing a deep learning model, according to some embodiments of the present application;
FIG. 7 illustrates a schematic structural diagram of an electronic device, according to some embodiments of the present application;
FIG. 8 illustrates a block diagram of a software architecture of an electronic device, according to some embodiments of the present application.
Detailed Description
Illustrative embodiments of the present application include, but are not limited to, a method, medium, and electronic device for processing a deep learning model.
According to the deep learning model processing method provided by the application, a deep learning model can be created as a visual graph through a neural network model (NN model) graphical arrangement tool, and the graphical deep learning model is saved as an intermediate format file rather than directly generating training code for a specific neural network model. The intermediate format file sits between the model graph and the model training code and is decoupled from any specific deep learning framework. Specifically, the intermediate format file can accurately express the structure of the model, so that the tool can fully generate one or more sets of model training code (also called building code), achieving one-time development with adaptation to multiple deep learning frameworks. In addition, the intermediate format file can also accurately express the graphical information of the model, i.e., its user interface (UI) information, so the tool can restore the intermediate format file to a visual graph of the deep learning model, making it convenient for a user to visually view or modify the deep learning model without manually writing code. Meanwhile, the tool provides a strict checking mechanism for the intermediate format file, so that, compared with directly writing training code, the deep learning model training code converted from the intermediate format file has a lower probability of coding errors. Therefore, the intermediate format file enables visual creation or modification by a user and convenient, accurate generation of model training code, i.e., one-time development with adaptation to multiple deep learning frameworks.
It can be understood that the processing method of the deep learning model provided by the present application is applied to a scenario of an Integrated Development Environment (IDE), and particularly, is applied to a model processing scenario based on a UI interface.
It should be noted that, in the Processing method of the deep learning model provided in the embodiment of the present application, the execution subject may be an electronic device, or a Central Processing Unit (CPU) of the electronic device, or a control module or device (or called a Processing device of the deep learning model) in the electronic device for executing the Processing of the deep learning model. In the following embodiments, a processing method of a deep learning model provided in the embodiments of the present application is described with an electronic device as an execution subject.
It is understood that electronic devices suitable for use in the present application may include, but are not limited to: mobile phones, tablet computers, video cameras, desktop computers, laptop computers, handheld computers, notebook computers, ultra-mobile personal computers (UMPC), netbooks, cellular phones, personal digital assistants (PDA), augmented reality (AR)/virtual reality (VR) devices, media players, smart televisions, smart speakers, smart watches, and the like.
Embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Referring to fig. 1, a system architecture block diagram to which the deep learning model processing method provided in the embodiment of the present application is applied is shown. As shown in fig. 1, the system 10 includes a UI interface 11, an intermediate format storage module 12, a MindSpore or other model building code module 13, a model training module 14, a model storage module 15, and an extended node definition module 16.
Specifically, the system 10 may be implemented by a main body (e.g., an electronic device) of the processing method for executing the deep learning model in the embodiment of the present application.
The UI interface 11 is a graphical interface that supports a user to complete the construction and/or modification of a Neural Network (NN) model (i.e., a deep learning model) through manual dragging and other operations. Specifically, the UI interface 11 includes predefined nodes (i.e., graphical model nodes) therein for enabling a user to drag and connect on the UI interface 11 to create or modify the deep learning model.
The intermediate format storage module 12 is used to save the deep learning model created or modified through user operations in the UI interface 11 as an intermediate format file, which is independent of any specific deep learning framework format. The MindSpore or other model building code module 13 is configured to convert the intermediate format file corresponding to the deep learning model, stored in the intermediate format storage module 12, into model building code in one or more formats, such as model building code for the MindSpore deep learning framework or another deep learning framework. The deep learning framework used for code generation is not limited in the embodiment of the application and can be chosen according to actual requirements.
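The key idea of module 13, one intermediate file yielding code for several frameworks, can be sketched by filling per-framework templates with each node's parameters. The template syntax and field names below are illustrative assumptions, not the module's actual implementation:

```python
def generate_build_code(intermediate, framework="mindspore"):
    """Fill each node's per-framework code template with its parameters
    to produce model building code for the selected framework."""
    lines = []
    for node in intermediate["nodes"]:
        stmt = node["templates"][framework].format(**node["params"])
        lines.append(f"self.{node['name']} = {stmt}")
    return "\n".join(lines)

# Hypothetical intermediate-format content for a single convolution node.
intermediate = {
    "nodes": [
        {
            "name": "conv1",
            "params": {"cin": 3, "cout": 16, "k": 3},
            "templates": {
                "mindspore": "nn.Conv2d({cin}, {cout}, {k})",
                "tensorflow": "tf.keras.layers.Conv2D({cout}, {k})",
            },
        },
    ],
}
ms_code = generate_build_code(intermediate, "mindspore")
tf_code = generate_build_code(intermediate, "tensorflow")
```

Because only the templates differ per framework, the same stored model yields MindSpore or TensorFlow building code without re-editing the graph, which is the "one-time development, multiple frameworks" property described above.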
The model training module 14 is used to run the model building code generated by the MindSpore or other model building code module 13 to train the model and obtain a trained model.
The model storage module 15 is configured to store the trained model obtained by the model training module 14, for example in the Protobuf format.
The extended node definition module 16 is configured to support the node extension mechanism of the system 10; specifically, it enables a user to customize an extended node through the UI interface 11, and then create or modify a deep learning model through both the predefined nodes and the extended nodes in the UI interface 11.
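One simple way to realize such an extension mechanism is a node registry that user-defined templates are added to at runtime. The registry structure and names below are assumptions for illustration, not the module's actual design:

```python
# Illustrative node registry: predefined node templates plus user-defined
# extension nodes registered at runtime (names and fields are assumptions).
PREDEFINED_NODES = {
    "Conv2D": {"category": "feature extraction"},
    "Relu":   {"category": "activation"},
}

def register_extension_node(registry, name, template):
    """Add a user-defined operator template; once registered, the extension
    node behaves like a predefined node for dragging, connecting, and code
    generation (given per-framework code in its template)."""
    if name in registry:
        raise ValueError(f"node '{name}' is already defined")
    registry = dict(registry)          # leave the predefined set untouched
    registry[name] = template
    return registry

all_nodes = register_extension_node(PREDEFINED_NODES, "Swish",
                                    {"category": "activation"})
```

Registering a template is all that is needed to support a new operator, matching the "add the template file and the operator is supported" behavior described for the first aspect.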
It should be noted that the deep learning model may include a plurality of operators. In a broad sense, any operation on a function can be regarded as an operator; for example, raising to a power or extracting a root can be regarded as operators. In particular, operators can be seen as mappings, relations, or transformations, such as a mapping from one function space (e.g., a Banach space, Hilbert space, or Sobolev space) to another function space.
In some possible embodiments, the operators in the deep learning model provided herein may include the following examples:
1. an initial input operator for inputting DATA as a first layer of input DATA (DATA) for the deep learning model. For example, a placeholder operator, where placeholder is a placeholder, is assigned a specific value when the operator is executed. In particular, the placeholder operator may be used to default to displaying gray text when no content is entered in the text entry box.
2. Activation operators, used to introduce nonlinearity into the deep learning model; the nonlinear content is usually what needs to be discriminated, such as regions of an image that need to be distinguished. For example, the activation operators include at least the following examples:
1) The ReLU operator, where ReLU stands for "rectified linear unit" and computes the element-wise maximum max(0, x) on the input matrix x. The ReLU function sets all negative values in the matrix x to zero and leaves the remaining values unchanged. The ReLU computation is typically performed after a convolution.
2) The LeakyReLU operator, which assigns a small non-zero slope to all negative values in the input matrix x.
3) The Sigmoid operator, which can speed up convergence during deep learning model training; specifically, the Sigmoid function compresses the input value into the range [0, 1].
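The three activation functions above can be sketched in a few lines of NumPy, matching the definitions just given (this is a generic illustration, not the patent's operator implementation):

```python
import numpy as np

def relu(x):
    # rectified linear unit: max(0, x); negative values become zero
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    # keeps a small non-zero slope for negative inputs
    return np.where(x >= 0, x, slope * x)

def sigmoid(x):
    # squashes any input into the range [0, 1]
    return 1.0 / (1.0 + np.exp(-x))
```

For example, `relu([-2, 3])` zeroes the negative entry, `leaky_relu(-2)` yields -0.02 with the default slope, and `sigmoid(0)` is exactly 0.5.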
3. Tensor operators. A Tensor in deep learning is in effect a multidimensional array, whose purpose is to represent matrices and vectors of higher dimensions. For example, the Tensor operators include the following examples:
1) +, -, *, /, i.e., the Add, Sub, Mul, and RealDiv operators corresponding to the four arithmetic operations of addition, subtraction, multiplication, and division, where "*" is the multiplication sign.
2) The BiasAdd operator, used to add a bias term, which is typically a one-dimensional Tensor. The bias unit, also called the bias term or intercept term, is the intercept of the function. For example, the parameters of a neural network may be expressed as (W, b), where W denotes the parameter matrix and b denotes the bias (intercept) term.
3) The MatMul operator, which performs matrix multiplication.
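The Tensor operators listed above map directly onto NumPy array operations; the identity weight matrix below is chosen only to make the BiasAdd result easy to verify by hand:

```python
import numpy as np

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
w = np.eye(2)                      # identity weights, for a simple check
b = np.array([0.5, -0.5])          # one-dimensional bias Tensor

elementwise = x + x, x - x, x * x, x / x   # Add, Sub, Mul, RealDiv
matmul_out = x @ w                         # MatMul: matrix multiplication
biasadd_out = matmul_out + b               # BiasAdd: broadcasts the 1-D bias
```

Note how BiasAdd relies on broadcasting: the 1-D bias b is added to every row of the MatMul output, exactly the "(W, b)" parameterization described for item 2.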
4. Feature extraction operators, which may include the following examples:
1) The Convolution operator, used to extract feature values block by block from the input (such as an input image) of the deep learning model. For example: the Conv1D (one-dimensional convolution) operator processes time-series signals or text, e.g., for text classification, acoustic models for speech recognition, and time-series prediction; the Conv2D (two-dimensional convolution) operator processes images and time-frequency representations (speech and audio), applied to object detection, localization, and recognition; the Conv3D (three-dimensional convolution) operator processes video, stereo images, tomographic images, etc., applied to video recognition or understanding, biomedical image analysis, and so on.
2) The transposed convolution operator (transposed convolution), used to restore the output size to the input size while keeping the connection pattern the same. Specifically, if the convolution operation is expressed as a matrix multiplication, the transposed convolution is the transpose of that matrix, thereby producing the inverse effect.
3) The FullConnection (FC) operator, which may be regarded as a special convolutional layer, or as a matrix multiplication, performing feature extraction with the entire input as a feature map. The FC operator is usually followed by the Softmax operator (or function), so the FC layer also performs a matrix dimension transformation into the dimension expected by Softmax.
4) The Deconvolution operator, used to map a low-dimensional space to a high-dimensional space while maintaining the connectivity pattern between them.
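As a sketch of how the convolution operators above relate input and output sizes (this is the standard output-size formula, not quoted from the patent):

```python
def conv_output_size(in_size, kernel_size, stride=1, pad=0):
    # Standard convolution output-size formula:
    # out = floor((in + 2*pad - kernel) / stride) + 1
    return (in_size + 2 * pad - kernel_size) // stride + 1

# With kernel_size=5 and pad=2, each edge is extended by 2 pixels, so the
# width and height each grow by 2*pad = 4 pixels and a stride-1 convolution
# does not shrink the feature map:
print(conv_output_size(28, kernel_size=5, pad=2))  # 28
# Without padding the same convolution shrinks the feature map:
print(conv_output_size(28, kernel_size=5, pad=0))  # 24
```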
5. The anti-overfitting operator may include the following examples:
1) The Pooling operator, which divides the input image into a number of rectangular regions and outputs the maximum or average value of each sub-region. Pooling reduces the number of parameters and helps prevent overfitting. For example, Pooling operators may include the AvgPool (average pooling) operator, the MaxPool (max pooling) operator, the Stochastic (stochastic pooling) operator, and the like.
The AvgPool operator preserves background information of an image: it slides a window over the feature map (similar to the window sliding of a convolution), takes the average value within the window as the result, and thereby downsamples the feature map, reducing overfitting. The MaxPool operator extracts characteristic texture and reduces the influence of useless information. Global pooling (global pool) obtains global context: the averaging is performed not over a window but over an entire feature map, i.e., each feature map outputs a single value. Stochastic pooling can be viewed as normalizing the values within a pooling window into probabilities and randomly sampling according to those normalized probabilities, i.e., larger element values have a larger probability of being selected.
2) The Mean operator is a sliding window operator with only an averaging function.
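A NumPy sketch of the windowed and global pooling behavior described above (illustrative, non-overlapping windows only):

```python
import numpy as np

def pool2d(fmap, window=2, mode="max"):
    # Slide a non-overlapping window over the feature map and reduce each
    # sub-region to one value (max keeps texture, average keeps background).
    h, w = fmap.shape
    out = np.empty((h // window, w // window))
    for i in range(h // window):
        for j in range(w // window):
            region = fmap[i*window:(i+1)*window, j*window:(j+1)*window]
            out[i, j] = region.max() if mode == "max" else region.mean()
    return out

fmap = np.array([[1., 2., 5., 6.],
                 [3., 4., 7., 8.],
                 [0., 0., 1., 1.],
                 [0., 4., 1., 1.]])
print(pool2d(fmap, mode="max"))  # each 2x2 window reduced to its maximum
print(pool2d(fmap, mode="avg"))  # each 2x2 window reduced to its average
# Global pooling averages the whole feature map into a single value:
print(fmap.mean())               # 2.75
```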
6. The Tensor dimension transform operator may include the following examples:
1) The Reshape operator, used to convert the input tensor description into a new shape, i.e., to transform the given matrix into a matrix with the specified dimensions.
2) The Flatten operator, used to merge the dimensions from start_axis to end_axis of the input tensor into one dimension.
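A NumPy sketch of the two tensor dimension transform operators, shown for the common case of flattening everything after the batch axis (illustrative):

```python
import numpy as np

# Reshape: turn a tensor description into a new shape
t = np.arange(24).reshape(2, 3, 4)   # 24 values -> shape (2, 3, 4)
print(t.shape)                        # (2, 3, 4)

# Flatten: merge the dimensions from start_axis=1 to end_axis=2 into one,
# so each sample becomes a single 12-element vector
flat = t.reshape(t.shape[0], -1)
print(flat.shape)                     # (2, 12)
```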
7. The classification operator, such as the Softmax operator, which usually serves as the last layer of the classification network of the deep learning model and outputs the probability of each class.
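A minimal sketch of the Softmax operator as the final classification layer (illustrative implementation):

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability, then normalise so the
    # outputs form a probability distribution over the classes
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(round(probs.sum(), 6))  # 1.0 -- the class probabilities sum to one
print(probs.argmax())         # 0  -- the highest logit gets the highest probability
```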
More specifically, in conjunction with fig. 1 and as shown in fig. 2, a UI interface 11 for processing a deep learning model is provided according to an embodiment of the present application. The UI interface 11 shown in fig. 2 is the UI interface of an application orchestration designer, i.e., of the graphical programming tool for neural network models (NN Model) provided in the present application.
It will be appreciated that the application orchestration designer described above may be implemented by the system 10 shown in FIG. 1.
Specifically, the UI interface 11 shown in fig. 2 includes a node selection area 101a, a function selection area 101b, a graphic editing area (or canvas) 101c, and an editing result display area 101 d.
The node selection area 101a includes a plurality of nodes of common node types and is used for adding the corresponding graphic to the graphic editing area 101c through user operations such as clicking and dragging. The graphic editing area 101c displays the graphics of the nodes added by the user and supports the user in connecting the nodes with connecting lines to determine the connection relations among them. For example, the user connects nodes using a directed connecting line: the node at the starting end of the line is the previous-stage node, and the node at the arrival end (i.e., the arrow end) is the next-stage node. Generally, the output of a previous-stage node serves as the input of the next-stage node.
It can be understood that, in the embodiment of the present application, the electronic device running the application layout designer may have an input device such as a mouse or a touch screen, supporting the user in performing various operations on the UI interface 11, including but not limited to the click and drag operations in the above example; the implementation of these user operations is not described further herein and may be set according to the actual requirements of the user.
Specifically, the node selection area 101a includes: a plurality of nodes of the common node class (denoted as common nodes), a plurality of nodes of the pooled node class (denoted as pooled nodes), and a plurality of nodes of the extended node class (denoted as extended nodes). The common nodes and the pooling nodes may be predefined nodes in the application layout designer, and the extension nodes may be user-defined nodes in the application layout designer. It will be appreciated that the nodes included in node selection area 101a are node templates provided by the application layout designer.
For example, the common nodes displayed in the node selection area 101a include the node "Placeholder" 201, the node "Reshape" 202, the node "MatMul" 203, the node "Softmax" 204, the node "Add" 205, the node "Conv2D" 206, the node "Conv1D" 207, the node "Conv3D" 208, the node "TransposedConv" 209, and the like, which correspond respectively to the Placeholder, Reshape, MatMul, Softmax, Add, Conv2D, Conv1D, Conv3D, and TransposedConv operators and have the functions of the corresponding operators.
The pooling nodes include the node "MaxPool" 210, the node "AvgPool" 211, and the node "Stochastic" 212, corresponding respectively to the MaxPool, AvgPool, and Stochastic operators and having the functions of the corresponding operators. The extension nodes include extension metanode 1 through extension metanode 3 and the like, all of which are user-defined nodes corresponding to the functions of user-defined operators. For example, extension metanode 1 corresponds to the Mean operator and has the function of that operator.
It can be understood that, in the embodiment of the present application, the UI interface 11 shown in fig. 2 includes, but is not limited to, the nodes illustrated in the above examples, and may also include any other nodes that can be implemented, that is, nodes corresponding to any operators that form a deep learning model, and this is not particularly limited in the embodiment of the present application.
In some embodiments, the intermediate format file provided by the present application includes not only the structural information of the deep learning model but also its graphical information. For example, the structural information of the deep learning model includes the parameters of the operator corresponding to each node in the model and the input-output relationships (i.e., connection relations) between the nodes, while the graphical information may include the graph type (graph_format), graph parameters (graph_para), links, and the like.
It will be appreciated that the application orchestration designer provided herein defines a plurality of node templates for supporting a user in selecting one or more nodes from the node templates to create or modify any deep learning model.
As an example, the node information of one node may include functional parameters of the node, graphic information, input output limit information, and code corresponding to at least one deep learning framework. And the functional parameters of the nodes are the functional parameters of the corresponding operators.
Specifically, the node information of one node may include: 1. the name of the node, i.e., the type of the node; 2. the shape of the graphic, such as rectangle or circle; 3. the attributes of the graphic, such as its display color, display size (e.g., the height and width of the graphic frame), and display position (i.e., coordinates) in the canvas; 4. the type of the corresponding operator, such as whether it is the convolution operator "Conv2D" or the activation operator "Relu"; 5. the parameters and parameter types of the operator; 6. input-output limit information; and 7. the code corresponding to each of the different deep learning frameworks. Items 1 to 3 constitute the graphic information of the node, and items 4 to 6 constitute the functional parameters of the node. The input-output limits of the nodes are used to ensure the integrity of the deep learning model formed by the nodes. For example, the input-output limit of a node may indicate the set of types of its input nodes and/or the set of types of its output nodes.
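As a hedged illustration, the seven items of node information might be laid out as follows (all field names and values here are hypothetical, not the patent's actual template format):

```python
# Hypothetical sketch of a node template, following the seven items of
# node information listed above (all field names are illustrative).
conv2d_template = {
    "name": "Conv2D",                          # 1. name / type of the node
    "shape": "rectangle",                      # 2. shape of the graphic
    "attributes": {"color": "RED",             # 3. display attributes
                   "height": 200, "width": 100,
                   "position": {"x": 0, "y": 0}},
    "operator_type": "conv2d",                 # 4. type of the operator
    "operator_params": {"kernel_size": "int",  # 5. parameters and types
                        "pad": "int"},
    "io_limits": {                             # 6. input/output limits
        "inputs": ["Placeholder", "Relu", "MaxPool"],
        "outputs": ["Relu", "BiasAdd"],
    },
    "codes": {                                 # 7. per-framework code
        "mindspore": "nn.Conv2d(in_channels, out_channels, kernel_size)",
        "tensorflow": "tf.nn.conv2d(input, filters, strides, padding)",
    },
}
print(sorted(conv2d_template["io_limits"]))  # ['inputs', 'outputs']
```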
In addition, the function selection area 101b includes one or more function controls supporting the user in processing the graphical deep learning model in the UI interface 11. For example, under a user operation such as a click: the function control 214 in the function selection area 101b triggers saving of the intermediate format file of the deep learning model; the function control 215 triggers opening of a saved graph, intermediate format file, or model building code of the deep learning model; and the function control 216 triggers generation of specific model building code from the intermediate format file of the graphical deep learning model. For example, the application orchestration designer generates by default the model building code corresponding to the MindSpore deep learning framework, or, by user setting, the model building code corresponding to another deep learning framework.
According to some embodiments of the present application, as an example, after a node in the node selection area 101a is added to the graphic editing area (or canvas) 101c, the graphic of each node may itself display an input area for the user to enter the functional parameters, or a click operation on the node may trigger the graphic editing area 101c to display an input area for setting each functional parameter of the node. Specifically, fig. 2 illustrates only the former: the area displaying the functional parameter "y<10>" on the node "Add" added to the graphic editing area 101c in fig. 2 is the functional parameter input area of that node, where the node parameter "y<10>" of the node "Add" represents a tensor.
According to some embodiments of the application, in the case where the electronic device triggers display of the content of the intermediate format file, the electronic device may switch the content displayed in the UI interface 11, or only the content in the graphic editing area 101c of the UI interface 11, to the content of the intermediate format file. Of course, the display position of the intermediate format file in the UI interface is not limited in the present application and may be any other position.
As an example, the following shows the contents of an intermediate format file, including: the graph name "graph_name" (i.e., the name of the currently created deep learning model), whose value is "mobilenet"; the information "operators" of each operator; and the connection relations "links" of the operators.
Specifically, an example of the operator information "operators" is as follows:
1) The node name "graph_name" is "conv2d_1"; the graph type "graph_type" in "graph_format" is "rectangle"; in the graph parameters "graph_para", the color "color" is "RED", the box size {"height":200, "width":100} indicates that the frame of the current node is 200 high and 100 wide, and the position (i.e., coordinate) parameter {"x":200, "y":355} indicates that the frame of the current node is located at (200, 355); the operator type "operator_type" is the two-dimensional convolution operator "conv2d"; and in the operator parameters "operator_param", the parameter "pad" is 2 and the parameter "kernel_size" is 5. Here, kernel_size denotes the convolution kernel size of "conv2d", and pad denotes its edge padding, 0 by default (no padding). When the feature map is padded, the padding is symmetric left-right and top-bottom; for example, with a 5 x 5 convolution kernel and pad set to 2, each of the four edges is extended by 2 pixels, i.e., the width and height each grow by 4 pixels, so that the feature map does not shrink after the convolution operation. Furthermore, the parameters of the two-dimensional convolution operator "conv2d" include, but are not limited to, the above examples.
2) The node name "graph_name" is "relu_1"; the graph type "graph_type" in "graph_format" is "rectangle"; in the graph parameters "graph_para", the color "color" is "RED", the box size {"height":200, "width":100} indicates that the frame of the current node is 200 high and 100 wide, and the position (i.e., coordinate) parameter {"x":553, "y":385} indicates that the frame of the current node is located at (553, 385); and the operator type "operator_type" is the activation operator "relu".
In addition, in the connection relations "links", the node "from" in the starting direction is the node "conv2d_1" and the node "to" in the arrival direction is the node "relu_1", indicating that the node "conv2d_1" is the previous-stage node of the node "relu_1".
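The intermediate format file described in items 1) and 2) above might look like the following sketch (the field names and values follow the description; the overall JSON layout is an assumption):

```python
import json

# Hypothetical reconstruction of the intermediate format file described
# above: field names follow the text, the nesting is illustrative.
intermediate = {
    "graph_name": "mobilenet",
    "operators": [
        {"graph_name": "conv2d_1",
         "graph_format": {"graph_type": "rectangle"},
         "graph_para": {"color": "RED",
                        "box": {"height": 200, "width": 100},
                        "position": {"x": 200, "y": 355}},
         "operator_type": "conv2d",
         "operator_param": {"pad": 2, "kernel_size": 5}},
        {"graph_name": "relu_1",
         "graph_format": {"graph_type": "rectangle"},
         "graph_para": {"color": "RED",
                        "box": {"height": 200, "width": 100},
                        "position": {"x": 553, "y": 385}},
         "operator_type": "relu"},
    ],
    # "from"/"to" record that conv2d_1 is the previous-stage node of relu_1
    "links": [{"from": "conv2d_1", "to": "relu_1"}],
}
print(json.dumps(intermediate["links"]))  # [{"from": "conv2d_1", "to": "relu_1"}]
```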
It should be noted that, in some embodiments, the application layout designer of the present application may set preset constraint conditions for specifying information such as the input, output, mapping (i.e., mapping relationship), and conditions (e.g., parameter requirements) of each node.
As an example, the constraint condition includes at least one of: the sub-constraint condition of the functional parameter of the node, the sub-constraint condition of the graph information of the node, and the sub-constraint condition of the input and the output of the node.
Specifically, the constraint condition may include at least one of: 1. the parameter type of a functional parameter of the node; 2. the value range of a functional parameter of the node; 3. the display size of the graphic of the node; 4. the display position of the graphic of the node; 5. the display shape of the graphic of the node; 6. the display color of the graphic of the node; 7. the set of node types allowed for the previous-stage node of the node; and 8. the set of node types allowed for the next-stage node of the node. Of course, constraints include, but are not limited to, the above examples. Items 1 and 2 are the sub-constraint conditions of the functional parameters of the node; items 3 to 6 are the sub-constraint conditions of the graphic information of the node; and items 7 and 8 are the sub-constraint conditions of the input and output of the node.
For example, the preset condition defines the following for the node corresponding to the Relu operator: the input of the Relu node may be limited to the output of the node corresponding to a convolution operator, i.e., the convolution node is the previous-stage node of the Relu node.
For example, the preset condition defines the following for the node corresponding to the Softmax operator: the input of the node corresponding to the Softmax operator may be limited to the output of the node corresponding to the FC operator, and the Softmax operator is usually the last layer of the classification network of the deep learning model.
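A minimal sketch of how such input/output sub-constraints could be checked (the CONSTRAINTS table, node names, and message text are illustrative, not the designer's implementation):

```python
# node type -> set of node types allowed as its previous-stage node
CONSTRAINTS = {
    "Relu":    {"Conv1D", "Conv2D", "Conv3D"},  # Relu input limited to convolution output
    "Softmax": {"FullConnection"},              # Softmax input limited to FC output
}

def check_link(prev_node_type, next_node_type):
    # Corresponds to checking a directed connecting line drawn by the user:
    # the node at the arrow end must accept the starting node as input.
    allowed = CONSTRAINTS.get(next_node_type)
    if allowed is not None and prev_node_type not in allowed:
        # In the UI this would surface as prompt information explaining the error
        return f"error: {next_node_type} cannot take input from {prev_node_type}"
    return "ok"

print(check_link("Conv2D", "Relu"))          # ok
print(check_link("Placeholder", "Softmax"))  # error: Softmax cannot take input from Placeholder
```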
Further, the editing result display area 101d may be used to display information such as input, output, mapping relationship, parameter requirement, and the like of the current node operated or selected by the user in the graphic editing area 101 c.
In the embodiment of the application, the electronic device may generate model creation code from the intermediate format file through the application layout designer, and then obtain training code from the model creation code.
In addition, the training code generated from the intermediate format file provided in the embodiment of the present application may generally include training parameters such as the training batch size, as well as the structure definitions of the operators set in the deep learning model. For example, the training code includes a training batch size "batch_size" of 32 and a learning rate "learning_rate" of 0.01.
The training code also includes the structure corresponding to each operator set in the deep learning model, such as x = self.conv1(x) for a convolution operator and x = self.relu(x) for an activation operator.
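A hedged sketch of what such generated training code might look like (MindSpore-style structure, heavily simplified, with stand-in operators so it runs without any framework; not the patent's actual output):

```python
# Training parameters generated into the training code
batch_size = 32       # training batch size "batch_size"
learning_rate = 0.01  # learning rate "learning_rate"

class Net:
    """Structure definitions generated from the intermediate format file."""
    def __init__(self, conv1, relu):
        self.conv1 = conv1   # convolution operator set in the model
        self.relu = relu     # activation operator set in the model

    def construct(self, x):
        x = self.conv1(x)    # the generated line x = self.conv1(x)
        x = self.relu(x)     # the generated line x = self.relu(x)
        return x

# Stand-in operators so the sketch is runnable without a framework:
net = Net(conv1=lambda x: x * 2, relu=lambda x: max(x, 0))
print(net.construct(-3))  # 0   (conv doubles to -6, relu clamps negatives)
print(net.construct(5))   # 10  (conv doubles to 10, relu passes positives)
```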
Example 1:
based on the system 10 shown in fig. 1 and the related description of the UI interface 11 shown in fig. 2, as shown in fig. 3, a flowchart of a processing method of a deep learning model provided in an embodiment of the present application is shown, where the method includes the following steps:
step 301: the electronic apparatus displays the UI interface 11 in response to an opening operation of the user to open the UI interface 11.
The operation in step 301 may be an opening operation by the user to open the application layout designer, such as a click operation. It is understood that the application layout designer may be an application installed in the electronic device, and the operation in step 301 may specifically be the user's click on the application icon of the application layout designer displayed on the electronic device, triggering the electronic device to open the application layout designer and thus the UI interface 11.
Step 302: the electronic apparatus adds and lays out the selected plurality of nodes in the graphic editing region 101c in response to a selection operation of the user on the plurality of nodes of the node selection region 101a in the UI interface 11.
The selection operation may be an operation in which the user clicks and drags a plurality of nodes from the node selection area 101a into the graphic editing area 101c with a mouse. For example, the nodes selected in step 302 may be the node "MatMul", the node "Add", and the like shown in the graphic editing area 101c of the UI interface 11 in fig. 2.
Step 303: the electronic device determines, in response to a setting operation of a user on a plurality of nodes in the graphic editing region 101c in the UI interface 11, functional parameters and graphic information in node information of the plurality of nodes operated in accordance with the constraint conditions.
The setting operation may be that the user clicks, with a mouse, the input area corresponding to a node in the graphic editing area 101c and manually enters the functional parameters and graphic parameters of the node via a keyboard, or selects one item of the displayed preset information with the mouse. For example, the functional parameter is the convolution kernel size of a convolution operator.
It is understood that, in the embodiment of the present application, the functional parameters and the graphic parameters of the node may be manually set or preset information by a user.
Step 304: the electronic equipment responds to the control operation of a user on a plurality of nodes in the graphic editing area 101c in the UI 11, displays connecting lines among the nodes on the graphic editing area 101c according to the constraint conditions, obtains the connection relation of the nodes, and obtains a graphical deep learning model.
The connecting lines among the plurality of nodes are used to indicate the connection relations among the plurality of nodes.
The control operation may include at least one of: an adjustment operation in which the user adjusts the layout or node information of the plurality of nodes added to the graphic editing area 101c, and a deletion operation in which the user deletes an added node. For example, the deletion operation may be: the user right-clicks the graphic of a node, the electronic device pops up a delete button near the graphic, and the user then left-clicks the delete button.
It is understood that, when a user drags a node, the electronic device may check, through the application layout designer, whether the functional parameters and graphic information of each node and the connection relations satisfy the above constraint conditions.
According to some embodiments of the application, when the functional parameters, graphic information, and connection relations of the nodes in the UI interface checked by the application layout designer all satisfy the constraint conditions, the electronic device may obtain the graphical deep learning model based on those nodes and generate the intermediate format file. If the application layout designer finds that the functional parameters, graphic information, or connection relations of some nodes do not satisfy the constraint conditions, prompt information is displayed on the UI interface to inform the user of the cause of the error, so that the user can modify the functional parameters, connecting lines, and the like of those nodes according to the prompt and thereby revise the graphical deep learning model.
Step 305: the electronic apparatus generates an intermediate format file based on node information and connection relationships of the plurality of nodes in response to an operation of the function control 214 of the function selection area 101b in the UI interface 11 by the user. For example, the operation may be a click operation or the like, and the intermediate format file may be an intermediate format file as shown in fig. 3.
In some embodiments, all node information of the plurality of nodes may be included in the intermediate format file.
Step 306: in response to the user's operation on the function control 216 of the function selection area 101b in the UI interface 11, the electronic device converts the intermediate format file into model building code, obtains the corresponding training code from the model building code, and stores the training code. For example, the format of the training code is the protobuf format.
The process of obtaining the training code from the model creation code is not described in detail in the embodiments of the present application; reference may be made to specific implementations in the related art.
In some embodiments, the electronic device may further save the intermediate format file after step 305 or step 306, and cancel displaying the plurality of nodes and the connecting lines between them in the graphic editing area 101c.
Specifically, the model building code generated in step 306 is the code corresponding to a default deep learning framework or to a deep learning framework selected by the user; that is, it may be composed of, for each node in the intermediate format file, the code corresponding to the default framework or to the user-selected framework.
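As a hedged illustration of this per-framework code generation, the following sketch maps each node's operator type to a framework-specific code template (the TEMPLATES table and all template strings are hypothetical, not the patent's actual generator):

```python
# Hypothetical per-framework code templates keyed by operator type.
TEMPLATES = {
    "mindspore": {
        "conv2d": "nn.Conv2d(in_ch, out_ch, kernel_size={kernel_size}, pad_mode='pad', padding={pad})",
        "relu": "nn.ReLU()",
    },
    "tensorflow": {
        "conv2d": "tf.keras.layers.Conv2D(filters, {kernel_size}, padding='same')",
        "relu": "tf.keras.layers.ReLU()",
    },
}

def generate(operators, framework="mindspore"):
    # One pass over the intermediate format yields code for any target
    # framework: develop once, support multiple deep learning frameworks.
    lines = []
    for op in operators:
        template = TEMPLATES[framework][op["operator_type"]]
        lines.append(template.format(**op.get("operator_param", {})))
    return lines

ops = [
    {"operator_type": "conv2d", "operator_param": {"pad": 2, "kernel_size": 5}},
    {"operator_type": "relu"},
]
for line in generate(ops, framework="mindspore"):
    print(line)
```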
It can be understood that, unlike direct code writing by users, the model building code obtained by converting the intermediate format file has undergone strict constraint checking when the file was generated based on the constraint conditions, thereby ensuring the legality and validity of the model code.
Therefore, in the embodiment of the application, an intermediate format file is adopted whose format is independent of any specific deep learning framework; the model can be dragged and its layout changed in the UI, and the deep learning model can be edited. By using the intermediate format, graphical development of deep learning models is realized: a model is developed once and is applicable to multiple deep learning frameworks.
Example 2:
according to some embodiments of the present application, the processing method of the deep learning model provided by the present application may open an already saved intermediate format file through the above application layout designer to view a graph of the deep learning model, or modify the graph of the deep learning model to modify the deep learning model.
Specifically, as shown in fig. 4, a method flow diagram of another processing method of a deep learning model provided in the embodiment of the present application is shown, where the method includes the following steps:
step 301 to step 305, wherein the detailed description of step 301 to step 305 may refer to the description in the above embodiment 1, and is not repeated here.
Step 307: the electronic device resumes displaying the intermediate format file as a plurality of nodes and connecting lines between the plurality of nodes, i.e., as a graphical deep learning model, in response to an operation of the function control 215 of the function selection area 101b in the UI interface 11 by the user.
Specifically, the electronic device has saved the intermediate format file before step 307 and has canceled displaying the plurality of nodes and the connecting lines between them in the graphic editing area 101c.
It is understood that, after step 307, the user may also trigger the electronic device to modify the graph of the deep learning model again in the UI interface 11, for example by adding a node, deleting a node, or modifying the node information of an added node; the detailed descriptions of these operations may refer to steps 302 to 305 in the foregoing embodiment and are not repeated here. Further, after the user modifies the graphical deep learning model, a new intermediate format file may be saved again and generated into the corresponding model creation code and training code, and so on. In this way, the user can look up the structure of the deep learning model through its graph while conveniently modifying the created model.
Example 3:
in other embodiments, as shown in fig. 5 in conjunction with fig. 2, the UI interface 11 further includes a control 213 enabling a user to customize one or more extension metanodes, that is, to add one or more nodes in the UI interface 11 of the application orchestration designer, supporting the user in triggering the electronic device to add nodes.
Specifically, another processing method of the deep learning model provided in this embodiment of the application may further include the following step 308 before step 302, step 303, or step 304 in the steps in embodiment 1 or embodiment 2.
As an example, with reference to fig. 3, as shown in fig. 6, a method flow diagram of another processing method of a deep learning model provided in an embodiment of the present application is shown, where the method includes the following steps:
step 301, wherein the detailed description of step 301 may refer to the description in the above embodiment 1, and is not repeated here.
Step 308: the electronic device, in response to a user operation on the control 213 of the node selection area 101a in the UI interface 11, adds one or more extension metanodes in the node selection area 101a. For example, the operation may be a click on the control 213.
Step 302 to step 306, wherein the detailed description of step 302 to step 306 may refer to the description in the above embodiment 1, and is not repeated here.
As an example, in response to the operation in step 308, the electronic device may pop up an additional area on the UI interface 11, such as within the graphic editing area 101c, for supporting the user in entering the information of the new node, for example: 1. the name of the node template; 2. the shape of the graphic; 3. the attributes of the graphic; 4. the type of the operator; 5. the parameters and parameter types of the operator; 6. input-output limits; and 7. the code corresponding to each of the different deep learning frameworks. Of course, in other embodiments, the user may also operate the UI interface 11 to modify or delete one or more nodes that have already been added.
For example, in the embodiment of the present application, a user-defined new node "meta_node" (extension metanode 1 as shown in fig. 5) includes the following information: the node name "name" is "conv2d"; the label "label" is "conv2d", i.e., the name displayed in the UI interface 11 may be conv2d; in the graph parameters "graph_para", the color "color" is "RED", and the box size {"height":200, "width":100} indicates that the frame of the current node is 200 high and 100 wide; the operator type "operator_type" is the convolution operator "conv2d"; in the operator parameters "operator_params", the parameters "dilations" and "pad" have the initial value type "int", pad denoting the edge padding (or completion) of the corresponding operator, and the parameter "kernel_size" has the initial value type "int", denoting the convolution kernel size of the corresponding operator. The model creation code "gene_code" includes the code "Conv2d(in_channels, out_channels, kernel_size, stride=1, pad_mode='same', …)" in the "mindspore" format and the code "conv2d(input, filters, strides, padding, data_format='NHWC', dilations, …)" in the "tensorflow" format. Here, input denotes the input data volume to be convolved, which is required to be a tensor; filters denotes the convolution kernel, also required to be a tensor; strides denotes the convolution stride; padding is a string, where "SAME" means the boundary is zero-padded and "VALID" means it is not; and data_format is an optional string taking the value "NHWC" or "NCHW", defaulting to "NHWC".
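A hedged sketch of this node extension mechanism: adding a user-defined extension metanode amounts to registering one more node template whose "gene_code" maps each supported framework to a code snippet (the registry structure and function are illustrative, not the patent's actual template format):

```python
# Illustrative registry of node templates; predefined common and pooling
# nodes would live here alongside user-defined extension metanodes.
registry = {}

def register_extension_node(name, operator_type, operator_params, gene_code):
    # gene_code maps each supported deep learning framework to the code
    # snippet generated for this node ("develop once, support many").
    registry[name] = {
        "operator_type": operator_type,
        "operator_params": operator_params,
        "gene_code": gene_code,
    }

register_extension_node(
    name="meta_node",
    operator_type="conv2d",
    operator_params={"dilations": "int", "pad": "int", "kernel_size": "int"},
    gene_code={
        "mindspore": "Conv2d(in_channels, out_channels, kernel_size, "
                     "stride=1, pad_mode='same')",
        "tensorflow": "conv2d(input, filters, strides, padding, "
                      "data_format='NHWC')",
    },
)
# A newly added operator is supported as soon as its template is registered:
print(sorted(registry["meta_node"]["gene_code"]))  # ['mindspore', 'tensorflow']
```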
In addition, in some embodiments, the node extension mechanism in the present application is controlled by permission, and only an authorized user can add a new node in the UI interface 11. For example, after the user clicks the control 213, the electronic device may start user identity detection, such as fingerprint detection, face detection, or character password detection, to identify the authorized user. The authentication information of the authorized user, such as a fingerprint, a face image, or a character password, may be preset and stored in the electronic device. During identity detection, the user is determined to have the authority when the fingerprint, face image, or character password collected in real time matches the authentication information prestored in the electronic device; a user who fails this verification does not have the authority.
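The permission check above reduces to comparing a credential collected in real time against authentication information stored in advance. A minimal sketch follows, assuming a hashed character-password scheme for illustration; the patent does not prescribe the storage or comparison mechanism.

```python
import hashlib

# Assumed scheme: the device stores only a hash of the preset credential.
STORED_CREDENTIAL = hashlib.sha256(b"preset-password").hexdigest()

def user_has_authority(collected: bytes) -> bool:
    """Return True only when the credential collected in real time
    matches the authentication information prestored on the device."""
    return hashlib.sha256(collected).hexdigest() == STORED_CREDENTIAL

# An authorized user passes; any other user lacks the authority to add nodes.
assert user_has_authority(b"preset-password")
assert not user_has_authority(b"wrong-password")
```

Fingerprint or face detection would replace the byte comparison with a biometric matcher, but the gating logic around the control 213 stays the same.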
Therefore, the embodiment of the application provides an extension mechanism for nodes, which further solves the extensibility problem of deep learning framework operators. When an operator is newly added, it can be supported simply by adding the template file corresponding to that operator, realizing one-time development with support for multiple deep learning frameworks.
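The "one-time development, multiple frameworks" idea can be sketched as a registry of operator templates, where each template carries one code snippet per deep learning framework and the generator merely performs a lookup and substitution. The registry layout and parameter names below are assumptions for illustration.

```python
# Assumed registry: one entry per operator, one snippet per framework.
# Adding a new operator means adding one entry here - nothing else changes.
OPERATOR_TEMPLATES = {
    "conv2d": {
        "mindspore": "Conv2d({in_channels}, {out_channels}, {kernel_size})",
        "tensorflow": "conv2d({input}, {filters}, {strides}, '{padding}')",
    },
}

def generate_code(op: str, framework: str, **params) -> str:
    """Render the model creation code for one node under a target framework."""
    template = OPERATOR_TEMPLATES[op][framework]
    return template.format(**params)

# Every framework listed in an operator's template is supported automatically.
line = generate_code("conv2d", "mindspore",
                     in_channels=3, out_channels=16, kernel_size=3)
print(line)  # Conv2d(3, 16, 3)
```

Because the editor never hard-codes framework knowledge, supporting a new operator (or a new framework) is a data change rather than a code change.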
Fig. 7 shows a schematic structural diagram of the electronic device 100. The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a Universal Serial Bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a key 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a Subscriber Identification Module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It is to be understood that the illustrated structure of the embodiment of the present application does not specifically limit the electronic device 100. In other embodiments of the present application, electronic device 100 may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units, wherein the different processing units may be stand-alone devices or may be integrated in one or more processors.
A memory may also be provided in processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 110. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 110, thereby increasing the efficiency of the system.
It should be understood that the interface connection relationship between the modules illustrated in the embodiments of the present application is only an illustration, and does not limit the structure of the electronic device 100. In other embodiments of the present application, the electronic device 100 may also adopt different interface connection manners or a combination of multiple interface connection manners in the above embodiments.
The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like. For example, the electronic apparatus 100 may establish wireless communication with the management apparatus 200 through a wireless communication function.
The wireless communication module 160 may provide solutions for wireless communication applied to the electronic device 100, including wireless local area networks (WLANs) (e.g., wireless fidelity (Wi-Fi) networks), Bluetooth (BT), the global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and the like.
The electronic device 100 implements display functions, such as displaying the driver management interface in the above example, through the GPU, the display screen 194, and the application processor. The GPU is a microprocessor for image processing, and is connected to the display screen 194 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
The electronic device 100 may implement a shooting function through the ISP, the camera 193, the video codec, the GPU, the display 194, the application processor, and the like. For example, the display 194 is used to display the UI interface of the user's graphically assembled deep learning model in the layout designer application.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to extend the memory capability of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, music, video, and intermediate format files of the deep learning model or files of the model creation code and training code are stored in the external memory card.
The internal memory 121 may be used to store computer-executable program code, which includes instructions. The processor 110 executes instructions stored in the internal memory 121 to perform various functional applications and data processing of the electronic device 100, such as creating or modifying a graphical deep learning model through the UI interface, saving the deep learning model in an intermediate format, and converting the intermediate format file into model creation code in one or more formats or restoring the model to a graph. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store an operating system, an application program (such as a sound playing function, an image playing function, and the like) required by at least one function, and the like.
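The processing described above (saving the graphical model in an intermediate format, then regenerating either code or the graph from it) can be pictured as follows. This is a hedged sketch: the intermediate file is assumed to be JSON holding both structural information (nodes and connections) and graphical information, with field names invented for illustration.

```python
import json

# Assumed intermediate-format content: structure + graphics in one file.
model = {
    "nodes": [
        {"id": 0, "operator_type": "conv2d", "graph_para": {"x": 10, "y": 20}},
        {"id": 1, "operator_type": "relu",   "graph_para": {"x": 10, "y": 80}},
    ],
    "connections": [[0, 1]],   # node 0's output feeds node 1's input
}

# Save the first target file in the intermediate format.
with open("model_intermediate.json", "w") as f:
    json.dump(model, f)

# Restore: the same file can drive either redrawing the graph on the model
# editing interface or emitting model creation code for a chosen framework.
with open("model_intermediate.json") as f:
    restored = json.load(f)
assert restored["connections"] == [[0, 1]]
```

Keeping graphics and structure in one file is what lets the electronic device round-trip between the drawn model and the generated code without information loss.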
The software system of the electronic device 100 may employ a layered architecture, an event-driven architecture, a micro-core architecture, a micro-service architecture, or a cloud architecture. The embodiment of the present application takes an Android system with a layered architecture as an example, and exemplarily illustrates a software structure of the electronic device 100.
Fig. 8 is a block diagram of a software configuration of an electronic device according to an embodiment of the present application. The layered architecture divides the software into several layers, each layer having a clear role and division of labor. The layers communicate with each other through a software interface. In some embodiments, the Android system is divided into four layers, an application layer, an application framework layer, an Android runtime (Android runtime) and system library, and a kernel layer from top to bottom. The application layer may include a series of application packages.
As shown in fig. 8, the application package may include applications such as camera, gallery, calendar, phone call, map, navigation, WLAN, bluetooth, music, video, short message, etc.
The application framework layer provides an Application Programming Interface (API) and a programming framework for the application program of the application layer. The application framework layer includes a number of predefined functions.
As shown in FIG. 8, the application framework layers may include a window manager, content provider, view system, phone manager, resource manager, notification manager, and the like.
The window manager is used for managing window programs. The window manager can obtain the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like.
The content provider is used to store and retrieve data and make it accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
The view system includes visual controls, such as controls for displaying text, controls for displaying pictures, and the like. The view system may be used to build applications. A display interface may be composed of one or more views. For example, a display interface including an SMS notification icon may include a view displaying text and a view displaying pictures, such as the UI interface displaying the user's graphically assembled deep learning model in the layout designer application.
The phone manager is used to provide communication functions for the electronic device 100. Such as management of call status (including on, off, etc.).
The resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and the like.
The notification manager enables the application to display notification information in the status bar, and can be used to convey notification-type messages that disappear automatically after a short stay without requiring user interaction. For example, the notification manager is used to notify of download completion, message alerts, and the like. The notification manager may also present notifications in the form of a chart or scroll-bar text in the top status bar of the system, such as notifications of background-running applications, or notifications that appear on the screen in the form of a dialog window. Examples include prompting text information in the status bar, sounding a prompt tone, vibrating the electronic device, and flashing an indicator light.
The Android Runtime comprises a core library and a virtual machine. The Android runtime is responsible for scheduling and managing an Android system.
The core library comprises two parts: one part is the functions that the java language needs to call, and the other part is the Android core libraries.
The application layer and the application framework layer run in the virtual machine. The virtual machine executes the java files of the application layer and the application framework layer as binary files. The virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
The system library may include a plurality of functional modules. For example: surface managers (surface managers), Media Libraries (Media Libraries), three-dimensional graphics processing Libraries (e.g., OpenGL ES), 2D graphics engines (e.g., SGL), and the like.
The surface manager is used to manage the display subsystem and provide a fusion of the 2D and 3D layers for multiple applications.
The media library supports playback and recording in a variety of commonly used audio and video formats, as well as still image files, among others. The media library may support a variety of audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, and the like.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is a layer between hardware and software. The inner core layer at least comprises a display driver, a camera driver, an audio driver and a sensor driver.
Embodiments of the mechanisms disclosed herein may be implemented in hardware, software, firmware, or a combination of these implementations. Embodiments of the application may be implemented as computer programs or program code executing on programmable systems comprising at least one processor, a storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device.
Program code may be applied to input instructions to perform the functions described herein and generate output information. The output information may be applied to one or more output devices in a known manner. For purposes of this application, a processing system includes any system having a processor such as, for example, a Digital Signal Processor (DSP), a microcontroller, an Application Specific Integrated Circuit (ASIC), or a microprocessor.
The program code may be implemented in a high level procedural or object oriented programming language to communicate with a processing system. The program code can also be implemented in assembly or machine language, if desired. Indeed, the mechanisms described in this application are not limited in scope to any particular programming language. In any case, the language may be a compiled or interpreted language.
In some cases, the disclosed embodiments may be implemented in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on one or more transitory or non-transitory machine-readable (e.g., computer-readable) storage media, which may be read and executed by one or more processors. For example, the instructions may be distributed via a network or via other computer-readable media. Thus, a machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer), including, but not limited to, floppy diskettes, optical disks, compact disc read-only memories (CD-ROMs), magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, flash memory, or a tangible machine-readable memory used to transmit information over the Internet via electrical, optical, acoustical, or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.). Thus, a machine-readable medium includes any type of machine-readable medium suitable for storing or transmitting electronic instructions or information in a form readable by a machine (e.g., a computer).
In the drawings, some features of the structures or methods may be shown in a particular arrangement and/or order. However, it is to be understood that such specific arrangement and/or ordering may not be required. Rather, in some embodiments, the features may be arranged in a manner and/or order different from that shown in the illustrative figures. In addition, the inclusion of a structural or methodical feature in a particular figure is not meant to imply that such feature is required in all embodiments, and in some embodiments, may not be included or may be combined with other features.
It should be noted that, in the apparatus embodiments of the present application, each unit/module is a logical unit/module. Physically, one logical unit/module may be one physical unit/module, may be a part of one physical unit/module, or may be implemented by a combination of multiple physical units/modules; the physical implementation of the logical unit/module itself is not the most important consideration, and the combination of functions implemented by the logical units/modules is the key to solving the technical problem addressed by the present application. Furthermore, in order to highlight the innovative part of the present application, the above-mentioned apparatus embodiments do not introduce units/modules that are less closely related to solving the technical problem presented in the present application; this does not indicate that no other units/modules exist in the above-mentioned apparatus embodiments.
It is noted that, in the examples and descriptions of this patent, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
While the present application has been shown and described with reference to certain preferred embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present application.

Claims (16)

1. A processing method of a deep learning model is applied to electronic equipment, and is characterized by comprising the following steps:
receiving a first operation of selecting at least one first control from a displayed model editing interface by a user;
responding to the first operation, displaying at least one node and a connecting line between the at least one node on the model editing interface, wherein the at least one node corresponds to the at least one first control in a one-to-one manner, each node has node information, and the connecting line of the at least one node is used for indicating the connecting relation between the at least one node;
and generating a first target file based on the node information of the at least one node and the connection relation between the at least one node, wherein the first target file comprises the structural information and the graphical information of the first deep learning model.
2. The method of claim 1, wherein each node information comprises at least one of: the functional parameters, the graphic information, the input and output limitation information of the corresponding operators and codes corresponding to at least one deep learning frame;
wherein each deep learning framework is used for generating model building codes in a code format corresponding to the first deep learning model.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
and responding to a second operation of the user, generating a first model creation code for the first target file based on a target deep learning frame, wherein the target deep learning frame is one of at least one deep learning frame, and each deep learning frame is used for generating a model construction code in a code format corresponding to the first deep learning model.
4. The method according to any one of claims 1 to 3, further comprising:
responding to a third operation of a user, saving the first target file, and canceling the display of the at least one node and a connecting line between the at least one node on the model editing interface;
and responding to a fourth operation of the user, and displaying the at least one node and the connecting line between the at least one node on the model editing interface again according to the first target file.
5. The method of claim 4, further comprising:
and responding to a fifth operation of the user, updating the at least one node and the connecting line of the at least one node on the model editing interface, and updating the first target file.
6. The method according to any one of claims 1 to 5, wherein the displaying at least one node and a connecting line between the at least one node on the model editing interface in response to the first operation comprises:
in response to the first operation, displaying at least one node and a connecting line between the at least one node on the model editing interface under the condition that a constraint condition is met; under the condition that the constraint condition is not met, displaying target prompt information on the model editing interface;
wherein the constraints comprise at least one of: the sub-constraint condition of the functional parameter of the corresponding operator of the node, the sub-constraint condition of the graphic information of the node, and the sub-constraint condition of the input and the output of the node; the target prompt message is used for prompting node information of the at least one node and/or errors of connection relations among the at least one node.
7. The method according to any one of claims 1 to 6, further comprising:
responding to a sixth operation of the user, adding at least one extension control in the model editing interface, wherein each extension control corresponds to one node, and each extension control is used for responding to the operation of the user and triggering the corresponding node to be displayed on the model editing interface.
8. A device for processing a deep learning model, comprising:
the receiving module is used for receiving a first operation that a user selects at least one first control from the displayed model editing interface;
a display module, configured to display, in response to the first operation received by the receiving module, at least one node and a connection line between the at least one node on the model editing interface, where the at least one node corresponds to the at least one first control one to one, each node has node information, and the connection line of the at least one node is used to indicate a connection relationship between the at least one node;
and the first generation module is used for generating a first target file based on the node information of the at least one node and the connection relation between the at least one node, wherein the first target file comprises the structural information and the graphical information of the first deep learning model.
9. The apparatus of claim 8, wherein each node information comprises at least one of the following information: the functional parameters, the graphic information, the input and output limitation information of the corresponding operators and codes corresponding to at least one deep learning frame;
wherein each deep learning framework is used for generating model building codes in a code format corresponding to the first deep learning model.
10. The apparatus of claim 8 or 9, further comprising:
and the second generation module is used for generating a first model creation code for the first target file generated by the first generation module based on a target deep learning frame in response to a second operation of the user, wherein the target deep learning frame is one of at least one deep learning frame, and each deep learning frame is used for generating a model construction code in a code format corresponding to the first deep learning model.
11. The apparatus of any one of claims 8 to 10, further comprising:
the saving module is used for responding to a third operation of a user, saving the first target file generated by the first generating module, and canceling the display of the at least one node and a connecting line between the at least one node on the model editing interface;
and the display module is further used for responding to a fourth operation of the user and displaying the at least one node and the connecting line between the at least one node on the model editing interface again according to the first target file.
12. The apparatus of claim 11, further comprising:
and the updating module is used for responding to a fifth operation of a user, updating the at least one node and the connecting line of the at least one node on the model editing interface, and updating the first target file.
13. The apparatus according to any one of claims 8 to 12,
the first generation module is specifically configured to, in response to the first operation, display at least one node and a connection line between the at least one node on the model editing interface under the condition that a constraint condition is satisfied; under the condition that the constraint condition is not met, displaying target prompt information on the model editing interface;
wherein the constraints comprise at least one of: the sub-constraint condition of the functional parameter of the corresponding operator of the node, the sub-constraint condition of the graphic information of the node, and the sub-constraint condition of the input and the output of the node; the target prompt message is used for prompting node information of the at least one node and/or errors of connection relations among the at least one node.
14. The apparatus of any one of claims 8 to 13, further comprising:
and the adding module is used for responding to the sixth operation of the user, adding at least one extension control in the model editing interface, wherein each extension control corresponds to one node, and each extension control is used for responding to the operation of the user and triggering the corresponding node to be displayed on the model editing interface.
15. A readable medium having stored thereon instructions which, when executed on an electronic device, cause the electronic device to perform the method of processing a deep learning model of any one of claims 1 to 7.
16. An electronic device, comprising: a memory for storing instructions for execution by one or more processors of an electronic device, and a processor, being one of the processors of the electronic device, for executing the instructions stored in the memory to implement the processing method of the deep learning model of any one of claims 1 to 7.
CN202110157121.7A 2021-02-04 2021-02-04 Deep learning model processing method and device and electronic equipment Pending CN114861860A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110157121.7A CN114861860A (en) 2021-02-04 2021-02-04 Deep learning model processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110157121.7A CN114861860A (en) 2021-02-04 2021-02-04 Deep learning model processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN114861860A true CN114861860A (en) 2022-08-05

Family

ID=82623078

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110157121.7A Pending CN114861860A (en) 2021-02-04 2021-02-04 Deep learning model processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN114861860A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116992241A (en) * 2023-09-26 2023-11-03 深圳前海环融联易信息科技服务有限公司 Model generation method and device, storage medium and computer equipment
CN116992241B (en) * 2023-09-26 2024-01-19 深圳前海环融联易信息科技服务有限公司 Model generation method and device, storage medium and computer equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination