CN111222637A - Neural network model deployment method and device, electronic equipment and storage medium - Google Patents
- Publication number: CN111222637A (application CN202010052349.5A)
- Authority: CN (China)
- Prior art keywords: neural network, network model, operator, data, basic operator
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The present disclosure relates to a neural network model deployment method and apparatus, an electronic device, and a storage medium, the method including: loading a neural network model; configuring initialization information for the neural network model, wherein the initialization information comprises resources allocated for operation and data storage of the neural network model; and converting the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information. The embodiments of the disclosure can extend the capability of the application platform and improve the applicability of the neural network model.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a neural network model deployment method and apparatus, an electronic device, and a storage medium.
Background
In specific applications of neural networks, such as deep learning, cooperation among many kinds of engineering technology is needed. The deployment scheme of a model concerns how the model is used by a program, and plays a key role in the overall neural network application scenario.
Currently, the training and the deployment of neural networks differ from each other, and deployment in particular requires higher efficiency.
Disclosure of Invention
The present disclosure provides a technical solution for neural network model deployment.
According to an aspect of the present disclosure, there is provided a neural network model deployment method, including:
loading a neural network model;
configuring initialization information for the neural network model, wherein the initialization information comprises resources allocated for operation and data storage of the neural network model;
and converting the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information.
In some possible embodiments, the converting the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information includes:
determining a basic operator matching a network layer in the neural network model, wherein the basic operator is represented by a preset programming language;
and converting the neural network model into the form of the preset programming language based on the network structure information, the initialization information and the basic operator of the neural network model.
In some possible embodiments, the determining a basic operator matching a network layer within the neural network model includes:
determining, based on a type of the network layer within the neural network model, at least one first basic operator corresponding to the type;
and selecting a first basic operator meeting preset requirements from the at least one first basic operator, and determining the selected first basic operator as the basic operator matched with the network layer.
In some possible embodiments, the selecting, from the at least one first basic operator, a first basic operator meeting a preset requirement includes at least one of:
selecting a first basic operator meeting the preset requirement from the first basic operators according to the parameters of the network layer of the neural network model;
and operating the neural network model, and selecting the first basic operator meeting the preset requirement based on the operating parameters of the at least one first basic operator.
In some possible embodiments, the preset requirements include: run time is minimal and/or memory is minimal.
In some possible embodiments, the converting the neural network model into the form of the preset programming language based on the network structure information of the neural network model, the initialization information, and the base operator includes:
running the neural network model by using the resources included in the initialization information;
determining first class data and second class data in the process of operating the neural network model, wherein the first class data comprises the structure and the initial weight of each network layer in the neural network model, and the second class data comprises the optimized weight of each network layer;
configuring the first type of data into global read-only data, and constructing a first structure body representing the second type of data, wherein the first structure body comprises an input pointer and an output pointer of the second type of data corresponding to the basic operator;
compiling to form an inference function with the first structure as a parameter, and writing a calling method of the basic operator into the inference function;
and generating an input interface and an output interface of the inference function.
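A minimal sketch of what generated code following these steps might look like in C is given below. It is an illustration under stated assumptions, not the code produced by the disclosed method: the single fully connected layer, its sizes and weight values, and the names `net_context`, `fully_connected_method0`, `net_infer`, and `net_run` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* First-class data: fixed at generation time, so it is emitted as global
 * read-only data. The layer size and weight values are hypothetical. */
enum { FC_IN = 4, FC_OUT = 2 };
static const float fc_weight[FC_OUT][FC_IN] = {
    {0.5f, -0.25f, 0.0f, 1.0f},
    {-1.0f, 0.5f, 0.25f, 0.0f},
};
static const float fc_bias[FC_OUT] = {0.0f, 1.0f};

/* "First structure": represents the mutable data as an input pointer and
 * an output pointer for the basic operator. */
typedef struct {
    const float *fc_in;  /* input of the fully connected operator  */
    float *fc_out;       /* output of the fully connected operator */
} net_context;

/* A basic operator written in the preset programming language (C). */
static void fully_connected_method0(const float *in, float *out) {
    for (size_t o = 0; o < FC_OUT; ++o) {
        float acc = fc_bias[o];
        for (size_t i = 0; i < FC_IN; ++i)
            acc += fc_weight[o][i] * in[i];
        out[o] = acc;
    }
}

/* Inference function: takes the structure as its parameter and contains
 * the calls to the basic operators in network order. */
static void net_infer(net_context *ctx) {
    assert(ctx && ctx->fc_in && ctx->fc_out);
    fully_connected_method0(ctx->fc_in, ctx->fc_out);
}

/* Input/output interface: binds caller buffers, then runs inference. */
void net_run(const float input[FC_IN], float output[FC_OUT]) {
    net_context ctx = { input, output };
    net_infer(&ctx);
}
```

A caller would pass an input buffer of `FC_IN` floats and an output buffer of `FC_OUT` floats to `net_run`. Keeping the fixed data `static const` lets a compiler place it in read-only storage, while everything that changes per call is reached only through the structure.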
In some possible embodiments, the method further comprises:
acquiring input data;
and processing the input data by utilizing the neural network model in the preset programming language form to obtain a prediction result.
In some possible embodiments, the neural network model includes at least one of a feature extraction neural network model, a classification neural network model, and an object detection neural network model.
According to a second aspect of the present disclosure, there is provided a neural network model deployment apparatus, comprising:
the loading module is used for loading the neural network model;
the configuration module is used for configuring initialization information for the neural network model, wherein the initialization information comprises resources allocated for operation and data storage of the neural network model;
and the conversion module is used for converting the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information.
In some possible embodiments, the conversion module is further configured to determine a basic operator matching a network layer within the neural network model, the basic operator being represented by a preset programming language; and
convert the neural network model into the form of the preset programming language based on the network structure information, the initialization information and the basic operator of the neural network model.
In some possible embodiments, the conversion module is further configured to determine, based on a type of the network layer within the neural network model, at least one first basic operator corresponding to the type; and
select a first basic operator meeting preset requirements from the at least one first basic operator, and determine the selected first basic operator as the basic operator matched with the network layer.
In some possible embodiments, the conversion module is further configured to select a first basic operator meeting a preset requirement from the at least one first basic operator by at least one of:
selecting a first basic operator meeting the preset requirement from the first basic operators according to the parameters of the network layer of the neural network model;
and operating the neural network model, and selecting the first basic operator meeting the preset requirement based on the operating parameters of the at least one first basic operator.
In some possible embodiments, the preset requirements include: run time is minimal and/or memory is minimal.
In some possible embodiments, the conversion module is further configured to run the neural network model using a resource included in the initialization information;
determining first class data and second class data in the process of operating the neural network model, wherein the first class data comprises the structure and the initial weight of each network layer in the neural network model, and the second class data comprises the optimized weight of each network layer;
configuring the first type of data into global read-only data, and constructing a first structure body representing the second type of data, wherein the first structure body comprises an input pointer and an output pointer of the second type of data corresponding to the basic operator;
compiling to form an inference function with the first structure as a parameter, and writing a calling method of the basic operator into the inference function;
and generating an input interface and an output interface of the inference function.
In some possible embodiments, the apparatus further comprises:
the acquisition module is used for acquiring input data;
and the prediction module is used for processing the input data by utilizing the neural network model in the preset programming language form to obtain a prediction result.
In some possible embodiments, the neural network model includes at least one of a feature extraction neural network model, a classification neural network model, and an object detection neural network model.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the memory-stored instructions to perform the method of any of the first aspects.
According to a fourth aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method of any one of the first aspects.
In the embodiments of the present disclosure, when a neural network model is deployed to an application platform, the neural network model may be loaded first and then, based on the initialization information configured for it, converted into the preset programming language form; that is, a code program in the preset programming language is generated. With this configuration, generating code that corresponds to the neural network model allows the model to be applied on platforms that do not support reading neural network files, which improves the extensibility and applicability of the platform. In addition, the conversion requires no extra time, because the corresponding code can be generated while the initialization information is being configured, which saves network deployment time.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
FIG. 1 shows a flow diagram of a neural network model deployment method in accordance with an embodiment of the present disclosure;
FIG. 2 illustrates a flow diagram for converting a neural network model to a preset programming language form in accordance with an embodiment of the present disclosure;
FIG. 3 shows a flowchart of step S31 according to an embodiment of the present disclosure;
FIG. 4 shows a flowchart of step S32 according to an embodiment of the present disclosure;
FIG. 5 illustrates a block diagram of a neural network model deployment device, in accordance with an embodiment of the present disclosure;
FIG. 6 shows a block diagram of an electronic device in accordance with an embodiment of the disclosure;
FIG. 7 shows a block diagram of another electronic device in accordance with an embodiment of the disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality; for example, including at least one of A, B, and C may mean including any one or more elements selected from the group consisting of A, B, and C.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
The neural network model deployment method provided by the embodiments of the present disclosure may be executed by any electronic device. For example, the method may be executed by a terminal device or a server, where the terminal device may be a User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, and the like. The server may be a local server or a cloud server. In some possible implementations, the neural network model deployment method may be implemented by a processor invoking computer-readable instructions stored in a memory.
Fig. 1 shows a flowchart of a neural network model deployment method according to an embodiment of the present disclosure, and as shown in fig. 1, the neural network model deployment method includes:
S10: loading a neural network model;
In some possible implementations, the embodiments of the present disclosure may deploy a neural network model in an application platform, where the application platform may be an application platform of a terminal device or an application platform in a server, which is not specifically limited by the present disclosure. The following description takes an electronic device as the executing device by way of example, but the present disclosure is not limited thereto.
In some possible embodiments, the neural network model may be a neural network model that implements any function, and may include at least one of a feature extraction neural network model that implements feature information extraction of input data, a target detection neural network model that implements object detection, a neural network model that implements target segmentation, a neural network model that implements target classification, and a neural network model that implements natural language processing, for example. The foregoing is illustrative only and is not to be construed as a specific limitation of the present disclosure.
In some possible embodiments, when a neural network model is to be deployed in the electronic device, the neural network model may be loaded first, i.e. the files of the neural network model are loaded into a program of the application platform. The neural network model may include a first file describing the network structure of the neural network, and a second file storing parameter information such as the weights used in the neural network. For example, the neural network model in the embodiments of the present disclosure may be in the caffemodel format. The neural network model may include two files, where the first file is a lenet file describing the network structure and the second file stores the weights of the network. The foregoing is merely exemplary and the disclosure is not limited thereto.
S20: configuring initialization information for the neural network model, wherein the initialization information comprises resources allocated for operation and data storage of the neural network model;
In some possible embodiments, the initialization information may be configured for the neural network model during or after loading the neural network model. Configuring the initialization information may include allocating corresponding resources, such as at least one of memory and video memory, for the operation and data storage of the neural network model. Running the neural network model requires the support of a plurality of data structures; each data structure corresponds to a computing resource, such as memory or video memory, and can be used for storing input and output data, network weights, network parameters, and the like.
The embodiments of the present disclosure can carry out the operation process of the neural network by inference based on its network structure, calculate the resources required to execute the neural network model based on the data volume and the memory occupied during the inference process, and perform the initialization configuration for the neural network model accordingly.
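One way to estimate such resources from the network structure alone is to propagate tensor shapes through the layers and add up the buffer sizes. The sketch below assumes this approach; the `layer_desc` and `shape` types, the floor rounding of output sizes, and the choice to keep one float buffer per layer output are illustrative assumptions rather than details taken from the disclosure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one layer with a square sliding window. */
typedef struct {
    size_t out_channels; /* channels produced by the layer */
    size_t kernel;       /* window size                    */
    size_t stride;
    size_t pad;
} layer_desc;

typedef struct { size_t c, h, w; } shape;

/* Output shape of a convolution or pooling layer (floor rounding). */
static shape layer_output_shape(shape in, const layer_desc *l) {
    assert(l->stride > 0 && in.h + 2 * l->pad >= l->kernel
                         && in.w + 2 * l->pad >= l->kernel);
    shape out;
    out.c = l->out_channels;
    out.h = (in.h + 2 * l->pad - l->kernel) / l->stride + 1;
    out.w = (in.w + 2 * l->pad - l->kernel) / l->stride + 1;
    return out;
}

/* Walks the layers in order and returns the bytes needed to hold the
 * network input plus every layer output as float buffers. */
size_t activation_bytes(shape input, const layer_desc *layers, size_t n) {
    size_t total = input.c * input.h * input.w * sizeof(float);
    shape cur = input;
    for (size_t i = 0; i < n; ++i) {
        cur = layer_output_shape(cur, &layers[i]);
        total += cur.c * cur.h * cur.w * sizeof(float);
    }
    return total;
}
```

For a 1x28x28 input followed by a 5x5 convolution with 20 output channels and a 2x2 pooling with stride 2, the buffers hold 784 + 11520 + 2880 floats. A real deployment could reduce this figure by reusing buffers whose contents are no longer needed.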
S30: and converting the neural network model into a preset programming language form based on the network structure information and the initialization information of the neural network model.
The embodiments of the present disclosure may convert the neural network model into the preset programming language form while the initialization information is being configured or after its configuration is completed. For example, the preset programming language may include at least one of the C language and the C++ language. Converting the neural network model into the preset programming language makes it suitable for different application platforms and improves the applicability of the neural network. In addition, when steps S20 and S30 are executed simultaneously, the conversion takes no additional time, which saves time and cost.
In the embodiment of the present disclosure, converting the neural network model into the preset programming language form may include using the preset programming language to implement the function of each network layer in the neural network model, using the preset programming language to generate and implement the call of the function corresponding to each network layer, thereby implementing the function of the neural network, and using the preset programming language to configure the input/output interface.
Through the configuration, the neural network model can be converted into a preset programming language form, and the method can be suitable for various application platforms. In the case of converting the neural network model into the preset programming language form, in the application platform, the processing of the input data may be performed by using the neural network model in the preset programming language form, where the input data may be determined according to the function of the neural network, for example, the input data may include at least one of voice, text, image, video, and the like. And executing the processing of the input data through a neural network model in a preset programming language form to obtain a prediction result.
For example, the neural network model in the embodiments of the present disclosure may be an MNIST neural network for recognizing handwritten digits. First, the neural network model may be loaded into a program, for example, a thread. The prediction result may include 10 floating point numbers, where the value of the i-th floating point number (counting from 0, as in programming) represents the probability that the digit is i. If a picture of "0" is input, the program returns (1,0,0,0,0,0,0,0,0,0); if a picture of "3" is input, the program returns (0,0,0,1,0,0,0,0,0,0). The position corresponding to the digit is 1, so the digit in the input picture is known.
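Reading the digit out of such a prediction result amounts to finding the position of the largest value. A minimal sketch follows; the function name `predicted_digit` is hypothetical and is not part of the disclosure.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the largest score; for a 10-value prediction
 * result, that index is the recognized digit. */
size_t predicted_digit(const float *scores, size_t count) {
    assert(scores != NULL && count > 0);
    size_t best = 0;
    for (size_t i = 1; i < count; ++i)
        if (scores[i] > scores[best])
            best = i;
    return best;
}
```

Taking the largest value rather than testing for exactly 1 also handles outputs that are probabilities strictly between 0 and 1.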
The following describes in detail a specific process of the embodiments of the present disclosure with reference to the drawings. In the case of loading the file of the neural network model into the program, the disclosed embodiment may allocate corresponding resources for the operation of the neural network, and at the same time, it is necessary to generate a code, such as a C code, related to the execution process of the neural network model, that is, convert the neural network model into a preset programming language form.
FIG. 2 illustrates a flow diagram for converting a neural network model to a preset programming language form in accordance with an embodiment of the present disclosure. Converting the neural network model into the preset programming language form based on the network structure information and the initialization information of the neural network model includes:
S31: determining a basic operator matching a network layer in the neural network model, wherein the basic operator is represented by a preset programming language;
S32: converting the neural network model into the preset programming language form based on the network structure information, the initialization information and the basic operator of the neural network model.
In some possible embodiments, in generating the preset programming language for the neural network model, matching base operators may be selected for each network layer within the network structure in the neural network model.
For example, the basic operators may include pooling operators (such as Pooling_3x3_pad1x1_method0) for performing a pooling operation, convolution operators (such as Convolution_3x3_pad1x1_method0) for performing a convolution operation, and Reduce_mean_method0, and may further include fully connected operators, classification operators, and the like.
After the neural network model is loaded, a basic operator matching each network layer in the network structure can be obtained according to the type of that network layer, where the type of a network layer corresponds to its function. For example, the network layers may include convolutional layers, pooling layers, fully connected layers, and the like. In the embodiments of the present disclosure, there may be one or more basic operators implementing the same function, and the code of different basic operators differs.
FIG. 3 shows a flowchart of step S31 according to an embodiment of the present disclosure. Determining a basic operator matching a network layer within the neural network model includes:
S311: determining, based on the type of a network layer in the neural network model, at least one first basic operator corresponding to the type;
S312: selecting a first basic operator meeting preset requirements from the at least one first basic operator, and determining the selected first basic operator as the basic operator matched with the network layer.
In some possible embodiments, in the case of determining the type (function) of the network layer in the neural network, a base operator that has the same function as the network layer of the neural network may be randomly selected from the base operator library as a matching base operator.
In some possible embodiments, when there are a plurality of basic operators implementing the function of a network layer, one of them may be selected at random as the matched basic operator, or a basic operator meeting preset requirements may be selected from them as the matched basic operator. The embodiments of the present disclosure refer to the basic operators matching the type of a network layer as first basic operators.
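The lookup of first basic operators by layer type can be pictured as a search over an operator library. The sketch below is illustrative only: the table layout and the function `find_first_operators` are assumptions, the operator names are modeled on the naming used in this description, and the `method1` and `FullyConnected_method0` entries are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    LAYER_CONVOLUTION,
    LAYER_POOLING,
    LAYER_FULLY_CONNECTED
} layer_type;

typedef struct {
    layer_type type;   /* layer type the operator implements */
    const char *name;  /* operator identifier                */
} operator_entry;

/* Hypothetical operator library; several entries may share one type. */
static const operator_entry operator_library[] = {
    {LAYER_CONVOLUTION,     "Convolution_3x3_pad1x1_method0"},
    {LAYER_CONVOLUTION,     "Convolution_3x3_pad1x1_method1"},
    {LAYER_POOLING,         "Pooling_3x3_pad1x1_method0"},
    {LAYER_FULLY_CONNECTED, "FullyConnected_method0"},
};

/* Writes up to max_out operators matching `type` into out and returns
 * how many were written: the first basic operators for that layer. */
size_t find_first_operators(layer_type type,
                            const operator_entry **out, size_t max_out) {
    assert(out != NULL);
    size_t found = 0;
    size_t n = sizeof operator_library / sizeof operator_library[0];
    for (size_t i = 0; i < n && found < max_out; ++i)
        if (operator_library[i].type == type)
            out[found++] = &operator_library[i];
    return found;
}
```

When the function returns more than one candidate, a further selection step is needed to pick the operator that is actually written into the generated code.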
Selecting the first basic operator meeting the preset requirement from the plurality of first basic operators matching the network layer may include at least one of the following manners:
A) selecting a first basic operator meeting preset requirements from the first basic operators according to parameters of a network layer of the neural network model;
In some possible embodiments, a simulated run may be performed directly with each first basic operator according to the parameters used by the network layer, to obtain occupancy parameters of the running resources, which may include the running time and the occupied memory. The first basic operator meeting the preset requirement can then be determined according to these occupancy parameters. In the embodiments of the present disclosure, the first basic operator with the shortest running time and/or the least occupied memory may be determined as the first basic operator meeting the preset requirement, so that the optimal basic operator matching the network layer is selected.
B) running the neural network model, and selecting the first basic operator meeting the preset requirement based on the running parameters of the at least one first basic operator.
In some possible embodiments, the neural network model that has not yet been converted into the preset programming language form may be run, and the running resources occupied by the first basic operators corresponding to each network layer when processing the corresponding data during actual operation are obtained. Likewise, the running resources may include the running time and the occupied memory, and the first basic operator meeting the preset requirement may be determined according to the running resources of each first basic operator. In the embodiments of the present disclosure, the first basic operator with the shortest running time and/or the least occupied memory may be determined as the first basic operator meeting the preset requirement, so that the optimal basic operator matching the network layer is selected.
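Once running time and memory figures are available for each candidate, whether from a simulated run (manner A) or from an actual run (manner B), the selection itself is a simple comparison. The sketch below assumes the measurements have already been collected; the `operator_cost` record, the `preference` setting, and the tie-breaking rule are illustrative assumptions, since the disclosure only states that running time and/or memory should be minimal.

```c
#include <assert.h>
#include <stddef.h>

/* Measured cost of one candidate operator (hypothetical record). */
typedef struct {
    const char *name;
    double run_time_ms;   /* measured or simulated running time */
    size_t memory_bytes;  /* measured or simulated memory use   */
} operator_cost;

typedef enum { PREFER_TIME, PREFER_MEMORY } preference;

/* Returns nonzero when a is strictly better than b: compare the
 * preferred metric first and use the other metric to break ties. */
static int better(const operator_cost *a, const operator_cost *b,
                  preference p) {
    if (p == PREFER_TIME) {
        if (a->run_time_ms != b->run_time_ms)
            return a->run_time_ms < b->run_time_ms;
        return a->memory_bytes < b->memory_bytes;
    }
    if (a->memory_bytes != b->memory_bytes)
        return a->memory_bytes < b->memory_bytes;
    return a->run_time_ms < b->run_time_ms;
}

/* Returns the index of the candidate that best meets the requirement. */
size_t select_operator(const operator_cost *c, size_t n, preference p) {
    assert(c != NULL && n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; ++i)
        if (better(&c[i], &c[best], p))
            best = i;
    return best;
}
```

Separating measurement from selection keeps the comparison deterministic, so the same routine can serve both manners of obtaining the figures.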
By the method, the basic operator matched with each network layer in the neural network model can be determined, and the preset programming language of each network layer is realized.
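As a minimal sketch of the selection described above, and not the implementation of the disclosure, the following C code measures candidate operators and picks the one with the shortest running time, breaking ties by occupied memory. The names op_fn, op_cost, relu_op, measure_seconds and select_operator are hypothetical, and the tie-breaking rule is an assumption; the disclosure only requires the shortest running time and/or the least occupied memory.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical common signature for every candidate operator of one layer type. */
typedef void (*op_fn)(const float *in, float *out, size_t n);

/* Occupancy parameters of one candidate: running time and occupied memory. */
typedef struct {
    double seconds;
    size_t bytes;
} op_cost;

/* Example candidate: an element-wise ReLU standing in for a first basic operator. */
void relu_op(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = in[i] > 0.0f ? in[i] : 0.0f;
}

/* Run a candidate `reps` times and return the processor time consumed, in seconds. */
double measure_seconds(op_fn op, const float *in, float *out, size_t n, int reps)
{
    assert(op != NULL && reps > 0);
    clock_t start = clock();
    for (int r = 0; r < reps; ++r)
        op(in, out, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Return the index of the candidate with the shortest running time.
 * Assumed tie-break: the candidate occupying less memory wins. */
size_t select_operator(const op_cost *costs, size_t count)
{
    assert(costs != NULL && count > 0);
    size_t best = 0;
    for (size_t i = 1; i < count; ++i) {
        int faster = costs[i].seconds < costs[best].seconds;
        int same_time = costs[i].seconds == costs[best].seconds;
        if (faster || (same_time && costs[i].bytes < costs[best].bytes))
            best = i;
    }
    return best;
}
```

In use, a caller would fill one op_cost per candidate (from a simulated run for approach A, or from an actual run of the model for approach B) and pass the array to select_operator.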
Once the basic operator matching each network layer has been obtained, a framework of the main inference function of the neural network model can further be generated, so as to implement the calls between the basic operators corresponding to the network layers.
Fig. 4 shows a flowchart of step S32 in an embodiment of the present disclosure, in which the neural network model is converted into a form of a preset programming language based on the network structure information, the initialization information, and the basic operator of the neural network model, including:
S321: running the neural network model by using the resources included in the initialization information;
In some possible embodiments, while the neural network is being converted into the preset programming language, the loaded neural network model may be run by using the configured initialization information. This process may be performed on its own or simultaneously with the process of determining the basic operators, which is not specifically limited by the present disclosure.
S322: determining a first type of data and a second type of data in the process of running the neural network model, wherein the first type of data comprises the structure and the initial weights of each network layer in the neural network model, and the second type of data comprises the optimized weights of each network layer;
In some possible embodiments, the first type of data and the second type of data used or generated by the neural network model during running may be determined in advance. The first type of data is global data that does not change during running, such as the structure of the network layers, the size of the convolution kernels, the size of the pooling kernels, and the initial weights of the network. The second type of data is data that may be modified during running, such as the optimized weights of the network and the input and output space of each operator. These are merely examples and do not limit the present disclosure.
S323: configuring the first type of data into global read-only data, and constructing a first structure body representing the second type of data, wherein the first structure body comprises an input pointer and an output pointer of the second type of data corresponding to a basic operator;
In some possible embodiments, each part of the data described in S32 may be converted into a representation in the preset programming language. The first type of data, which does not change during running, can be written as global read-only data. For the second type of data, a first structure, such as a C structure (referred to herein as an inference structure), can be defined to represent the information that changes during inference; the first structure can contain pointers to the inputs and outputs that propagate between the basic operators, as well as other necessary information.
S324: compiling to form an inference function with the first structure as a parameter, and writing a calling method of the basic operator into the inference function;
In some possible embodiments, a main inference function of the neural network may also be generated. For example, inference code is generated by writing, according to the syntax of the preset programming language, an inference function framework that takes as its parameter the first structure described in step S33. The calling method of the basic operator matching each network layer can then be written into the inference function in the preset programming language. The function parameters of each basic operator can be taken from the generated global read-only data and the first structure.
S325: generating an input interface and an output interface of the inference function.
In some possible embodiments, the goal of running the neural network is to provide an input and then obtain an output. A correspondence between the neural network model and its external input and output therefore needs to be established, and interfaces for setting the input and acquiring the output need to be provided accordingly. The parameters required by the input interface and the output interface should come from the global read-only data and the first structure defined above.
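The result of steps S323 to S325 might look roughly like the following generated C code. This is an illustrative sketch under stated assumptions rather than the output of the disclosed converter: the two-layer model (a fully connected layer followed by ReLU), the weight values, and every identifier (FC_WEIGHT, FC_BIAS, infer_ctx, model_set_input, model_set_output, model_get_output, model_infer) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { IN_DIM = 3, OUT_DIM = 2 };

/* First type of data: unchanged while running, emitted as global read-only data. */
static const float FC_WEIGHT[OUT_DIM][IN_DIM] = {
    {1.0f, 0.0f, -1.0f},
    {0.5f, 0.5f, 0.5f},
};
static const float FC_BIAS[OUT_DIM] = {0.0f, 1.0f};

/* Second type of data: the "first structure" (inference structure), holding
 * the input and output pointers and the buffer passed between operators. */
typedef struct {
    const float *input;        /* IN_DIM values supplied by the caller  */
    float       *output;       /* OUT_DIM values written for the caller */
    float        hidden[OUT_DIM];
} infer_ctx;

/* Basic operator: fully connected layer reading the read-only weights. */
static void op_fully_connected(const float *in, float *out)
{
    for (size_t o = 0; o < OUT_DIM; ++o) {
        float acc = FC_BIAS[o];
        for (size_t i = 0; i < IN_DIM; ++i)
            acc += FC_WEIGHT[o][i] * in[i];
        out[o] = acc;
    }
}

/* Basic operator: element-wise ReLU. */
static void op_relu(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = in[i] > 0.0f ? in[i] : 0.0f;
}

/* Input and output interfaces of the inference function. */
void model_set_input(infer_ctx *ctx, const float *data) { ctx->input = data; }
void model_set_output(infer_ctx *ctx, float *buffer) { ctx->output = buffer; }
const float *model_get_output(const infer_ctx *ctx) { return ctx->output; }

/* Inference function: takes the first structure as its parameter and calls
 * the basic operator matched to each network layer, in layer order. */
void model_infer(infer_ctx *ctx)
{
    assert(ctx != NULL && ctx->input != NULL && ctx->output != NULL);
    op_fully_connected(ctx->input, ctx->hidden);
    op_relu(ctx->hidden, ctx->output, OUT_DIM);
}
```

Because the weights are compiled into the program as constants, a caller only needs to declare an infer_ctx, set the input and output buffers, and call model_infer; no model file is read at run time.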
In some possible embodiments, there may be scenarios in which multiple network layers or multiple network models cooperate. In that case, the defined global read-only data and the first structure may be hidden behind an identifier, and a creation function and a destruction function whose names reflect the function of the model may be provided. The same treatment (function names that reflect the function of the model) is then applied to the input-setting interface and the output-acquiring interface of the inference function, so that multiple models can be used in one program.
In the above manner, the functions of the network modules of the neural network can be associated with the corresponding functions in the preset programming language. In a specific application, a model name (e.g., a C string) may be bound to the generated inference function through a simple mapping table from strings to structures, where each structure includes function pointers for creating and destroying an inference structure instance and function pointers for setting the input and acquiring the output. The upper-level logic can thus be isolated from the lower-level C functions that implement the model. That is to say, in the embodiments of the present disclosure, the inference function generated for each neural network model may be bound to the function of that neural network model to form a mapping table representing the binding relationship; the mapping table may further include pointers to the functions that create and destroy the global read-only data and the first structure, and pointers to the functions that set the input of the inference and obtain its output (a pointer to the input interface and a pointer to the output interface). In this way, each neural network model can be executed conveniently without the models interfering with one another.
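One possible shape for such a mapping table is sketched below in C. The sketch is hypothetical throughout: the two toy models ("doubler" and "negate"), the model_entry layout, and find_model are illustrative assumptions, and malloc is used only for brevity on the assumption that the platform supports dynamic allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Opaque to upper-level code; only the generated functions look inside. */
typedef struct model_instance model_instance;

/* One row of the mapping table: a model name bound to its function pointers. */
typedef struct {
    const char      *name;
    model_instance *(*create)(void);
    void            (*destroy)(model_instance *);
    void            (*set_input)(model_instance *, const float *);
    void            (*infer)(model_instance *);
    float           (*get_output)(const model_instance *);
} model_entry;

/* Toy inference structure: each "model" scales one input value. */
struct model_instance {
    const float *input;
    float        output;
    float        scale;
};

static model_instance *create_with_scale(float scale)
{
    model_instance *m = malloc(sizeof *m);
    if (m != NULL) {
        m->input = NULL;
        m->output = 0.0f;
        m->scale = scale;
    }
    return m;
}

static model_instance *doubler_create(void) { return create_with_scale(2.0f); }
static model_instance *negate_create(void) { return create_with_scale(-1.0f); }
static void model_destroy(model_instance *m) { free(m); }
static void model_set_input(model_instance *m, const float *x) { m->input = x; }
static float model_get_output(const model_instance *m) { return m->output; }

static void model_infer(model_instance *m)
{
    assert(m != NULL && m->input != NULL);
    m->output = m->scale * *m->input;
}

/* Mapping table from model name (C string) to the generated functions. */
static const model_entry MODEL_TABLE[] = {
    {"doubler", doubler_create, model_destroy, model_set_input, model_infer, model_get_output},
    {"negate",  negate_create,  model_destroy, model_set_input, model_infer, model_get_output},
};

/* Look a model up by name; returns NULL when the name is not registered. */
const model_entry *find_model(const char *name)
{
    for (size_t i = 0; i < sizeof MODEL_TABLE / sizeof MODEL_TABLE[0]; ++i)
        if (strcmp(MODEL_TABLE[i].name, name) == 0)
            return &MODEL_TABLE[i];
    return NULL;
}
```

Upper-level logic then works only with the name and the returned entry, for example calling find_model("doubler") and using its create, set_input, infer, get_output and destroy pointers, without referring to any model-specific symbol.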
In addition, in the embodiments of the present disclosure, if the platform environment of the actual application does not support performing the code conversion, the conversion of the neural network model may be performed by a server or another device, and the converted code is then loaded onto the application platform. That is, the code conversion can be carried out in an environment similar to the application platform or offline.
In addition, if the platform of the actual application does not support dynamic memory allocation, pre-configured resource management information can be received before the neural network model is loaded or before the initialization information is configured, so that the resources are configured accordingly.
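On a platform without dynamic memory allocation, one common technique (assumed here for illustration; the disclosure does not specify it) is to reserve a fixed buffer at build time and hand out pieces of it with a bump allocator. In the sketch below, the buffer size ARENA_BYTES and the names max_aligned, arena, model_arena_init and arena_alloc are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { ARENA_BYTES = 1024 };   /* assumed, pre-configured resource budget */

/* Union used only to give the static buffer a conservative alignment. */
typedef union {
    long double ld;
    long long   ll;
    void       *p;
} max_aligned;

/* Statically reserved storage: no malloc is needed at run time. */
static max_aligned g_storage[ARENA_BYTES / sizeof(max_aligned)];

typedef struct {
    unsigned char *base;
    size_t         capacity;
    size_t         used;
} arena;

/* Bind an arena to the statically reserved storage. */
void model_arena_init(arena *a)
{
    assert(a != NULL);
    a->base = (unsigned char *)g_storage;
    a->capacity = sizeof g_storage;
    a->used = 0;
}

/* Hand out `bytes` from the arena, rounded up to a multiple of
 * sizeof(max_aligned); returns NULL when the budget is exhausted. */
void *arena_alloc(arena *a, size_t bytes)
{
    const size_t unit = sizeof(max_aligned);
    assert(a != NULL);
    if (bytes > a->capacity - a->used)
        return NULL;
    size_t rounded = (bytes + unit - 1) / unit * unit;
    if (rounded > a->capacity - a->used)
        return NULL;
    void *block = a->base + a->used;
    a->used += rounded;
    return block;
}
```

The input, output and intermediate buffers referenced by the inference structure could then be carved out of this arena during initialization, so the memory footprint is fixed before the model runs.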
In summary, in the process of deploying the neural network model to the application platform, the neural network model may be loaded first and then, based on the initialization information configured for the neural network, converted into the preset programming language form; that is, a code program in the preset programming language is generated. With this configuration, generating the code corresponding to the neural network model allows the model to be applied on a platform that does not support reading neural network files, which increases the expandability and applicability of the platform. At the same time, no extra time is needed for the code conversion, because the corresponding code can be generated while the initialization information is being configured; this saves network deployment time, for example by reducing the initialization time. Reducing the initialization time at startup is particularly important when the neural network of the embodiments of the present disclosure is applied in a vehicle or in other scenarios with strict response-time requirements (for example, where devices need to be started and shut down repeatedly). In addition, the embodiments of the present disclosure can also address application environments that have no file IO.
It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.
It is understood that the above-mentioned method embodiments of the present disclosure can be combined with each other to form combined embodiments without departing from their principles and logic; for brevity, the details are not repeated in the present disclosure.
In addition, the present disclosure also provides a neural network model deployment apparatus, an electronic device, a computer-readable storage medium, and a program, which can be used to implement any one of the neural network model deployment methods provided by the present disclosure; for the corresponding technical solutions and descriptions, refer to the corresponding descriptions in the method section, which are not repeated here.
Fig. 5 is a block diagram of a neural network model deployment device according to an embodiment of the present disclosure, and as shown in fig. 5, the neural network model deployment device may include:
a loading module 10, configured to load a neural network model;
a configuration module 20, configured to configure initialization information for the neural network model, where the initialization information includes resources allocated for operation and data storage of the neural network model;
and a conversion module 30, configured to convert the neural network model into a preset programming language form based on the network structure information and the initialization information of the neural network model.
In some possible embodiments, the conversion module is further configured to determine a basic operator matching a network layer within the neural network model, the basic operator being represented in a preset programming language; and
convert the neural network model into the preset programming language form based on the network structure information, the initialization information and the basic operator of the neural network model.
In some possible embodiments, the conversion module is further configured to determine, based on a type of a network layer within the neural network model, at least one first basic operator corresponding to the type; and
select a first basic operator meeting a preset requirement from the at least one first basic operator, and determine the selected first basic operator as the basic operator matching the network layer.
In some possible embodiments, the conversion module is further configured to select a first basic operator meeting the preset requirement from the at least one first basic operator in at least one of the following ways:
selecting a first basic operator meeting the preset requirement from the first basic operators according to parameters of a network layer of the neural network model;
running the neural network model, and selecting the first basic operator meeting the preset requirement based on the running parameters of the at least one first basic operator.
In some possible embodiments, the preset requirement includes: the shortest running time and/or the least occupied memory.
In some possible embodiments, the conversion module is further configured to run the neural network model using the resources included in the initialization information;
determining first class data and second class data in the process of running the neural network model, wherein the first class data comprise structures and initial weights of all network layers in the neural network model, and the second class data comprise optimized weights of all network layers;
configuring the first type of data into global read-only data, and constructing a first structure body representing the second type of data, wherein the first structure body comprises an input pointer and an output pointer of the second type of data corresponding to a basic operator;
compiling to form an inference function with the first structure as a parameter, and writing a calling method of the basic operator into the inference function;
and generating an input interface and an output interface of the inference function.
In some possible embodiments, the apparatus further comprises:
an acquisition module, configured to acquire input data;
and a prediction module, configured to process the input data by using the neural network model in the preset programming language form to obtain a prediction result.
In some possible embodiments, the neural network model includes at least one of a feature extraction neural network model, a classification neural network model, and an object detection neural network model.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.
Embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the above-mentioned method. The computer readable storage medium may be a non-volatile computer readable storage medium.
An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to execute the above method.
The electronic device may be provided as a terminal, server, or other form of device.
Fig. 6 illustrates a block diagram of an electronic device in accordance with an embodiment of the disclosure. For example, the electronic device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or a similar terminal.
Referring to fig. 6, electronic device 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 806 provides power to the various components of the electronic device 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 800 is in an operation mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing state assessments of various aspects of the electronic device 800. For example, the sensor component 814 may detect an open/closed state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800. The sensor component 814 may also detect a change in the position of the electronic device 800 or of a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in the temperature of the electronic device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium, such as the memory 804, is also provided that includes computer program instructions executable by the processor 820 of the electronic device 800 to perform the above-described methods.
Fig. 7 shows a block diagram of another electronic device in accordance with an embodiment of the disclosure. For example, the electronic device 1900 may be provided as a server. Referring to fig. 7, electronic device 1900 includes a processing component 1922 further including one or more processors and memory resources, represented by memory 1932, for storing instructions, e.g., applications, executable by processing component 1922. The application programs stored in memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1922 is configured to execute instructions to perform the above-described method.
The electronic device 1900 may also include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium, such as the memory 1932, is also provided that includes computer program instructions executable by the processing component 1922 of the electronic device 1900 to perform the above-described methods.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, the electronic circuitry that can execute the computer-readable program instructions implements aspects of the present disclosure by utilizing the state information of the computer-readable program instructions to personalize the electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA).
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen in order to best explain the principles of the embodiments, the practical application, or technical improvements to the techniques in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (18)
1. A neural network model deployment method, comprising:
loading a neural network model;
configuring initialization information for the neural network model, wherein the initialization information comprises resources allocated for operation and data storage of the neural network model;
and converting the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information.
2. The method according to claim 1, wherein the converting the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information comprises:
determining a basic operator matching a network layer in the neural network model, wherein the basic operator is represented in a preset programming language;
and converting the neural network model into the form of the preset programming language based on the network structure information, the initialization information and the basic operator of the neural network model.
3. The method of claim 2, wherein the determining a basic operator matching a network layer in the neural network model comprises:
determining, based on a type of the network layer within the neural network model, at least one first base operator corresponding to the type;
and selecting a first basic operator meeting preset requirements from the at least one first basic operator, and determining the selected first basic operator as the basic operator matched with the network layer.
4. The method according to claim 3, wherein the selecting the first basic operator meeting the preset requirement from the at least one first basic operator comprises at least one of the following ways:
selecting the first basic operator meeting the preset requirement from the at least one first basic operator according to parameters of the network layer of the neural network model;
and running the neural network model, and selecting the first basic operator meeting the preset requirement based on running parameters of the at least one first basic operator.
5. The method according to claim 3 or 4, wherein the preset requirement comprises: the running time is the shortest and/or the memory occupied is the smallest.
6. The method according to any one of claims 2 to 5, wherein the converting the neural network model into the form of the preset programming language based on the network structure information, the initialization information and the basic operator of the neural network model comprises:
running the neural network model by using the resources included in the initialization information;
determining first-type data and second-type data in the process of running the neural network model, wherein the first-type data comprises the structure and the initial weight of each network layer in the neural network model, and the second-type data comprises the optimized weight of each network layer;
configuring the first-type data as global read-only data, and constructing a first structure representing the second-type data, wherein the first structure comprises an input pointer and an output pointer of the second-type data corresponding to the basic operator;
compiling to form an inference function that takes the first structure as a parameter, and writing a calling method of the basic operator into the inference function;
and generating an input interface and an output interface of the inference function.
7. The method according to any one of claims 1-6, further comprising:
acquiring input data;
and processing the input data by utilizing the neural network model in the preset programming language form to obtain a prediction result.
8. The method of any one of claims 1-7, wherein the neural network model comprises at least one of a feature extraction neural network model, a classification neural network model, and an object detection neural network model.
9. A neural network model deployment device, comprising:
a loading module, configured to load a neural network model;
a configuration module, configured to configure initialization information for the neural network model, wherein the initialization information comprises resources allocated for operation and data storage of the neural network model;
and a conversion module, configured to convert the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information.
10. The device according to claim 9, wherein the conversion module is further configured to determine a basic operator that matches a network layer in the neural network model, the basic operator being expressed in the preset programming language; and
convert the neural network model into the form of the preset programming language based on the network structure information, the initialization information and the basic operator of the neural network model.
11. The device according to claim 10, wherein the conversion module is further configured to determine, based on a type of the network layer in the neural network model, at least one first basic operator corresponding to the type; and
select, from the at least one first basic operator, a first basic operator meeting a preset requirement, and determine the selected first basic operator as the basic operator that matches the network layer.
12. The device according to claim 11, wherein the conversion module is further configured to select the first basic operator meeting the preset requirement from the at least one first basic operator in at least one of the following ways:
selecting the first basic operator meeting the preset requirement from the at least one first basic operator according to parameters of the network layer of the neural network model;
and running the neural network model, and selecting the first basic operator meeting the preset requirement based on running parameters of the at least one first basic operator.
13. The device according to claim 11 or 12, wherein the preset requirement comprises: the running time is the shortest and/or the memory occupied is the smallest.
14. The device according to any one of claims 10-13, wherein the conversion module is further configured to: run the neural network model by using the resources included in the initialization information;
determine first-type data and second-type data in the process of running the neural network model, wherein the first-type data comprises the structure and the initial weight of each network layer in the neural network model, and the second-type data comprises the optimized weight of each network layer;
configure the first-type data as global read-only data, and construct a first structure representing the second-type data, wherein the first structure comprises an input pointer and an output pointer of the second-type data corresponding to the basic operator;
compile to form an inference function that takes the first structure as a parameter, and write a calling method of the basic operator into the inference function;
and generate an input interface and an output interface of the inference function.
15. The device according to any one of claims 9-14, further comprising:
an acquisition module, configured to acquire input data;
and a prediction module, configured to process the input data by using the neural network model in the preset programming language form to obtain a prediction result.
16. The device according to any one of claims 9-15, wherein the neural network model comprises at least one of a feature extraction neural network model, a classification neural network model, and an object detection neural network model.
17. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the method of any one of claims 1-8.
18. A computer-readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any one of claims 1-8.
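By way of illustration only, the operator selection recited in claims 3 to 5 may be sketched as follows. The sketch is not part of the claims and is not the disclosed implementation. It assumes a hypothetical registry `CANDIDATES` that maps a network layer type to its candidate first basic operators, and it takes "preset requirement" to mean the shortest measured running time. The names `relu_loop`, `relu_comprehension`, `measure_runtime` and `select_operator` are likewise hypothetical, and Python is used only for brevity; the preset programming language of the disclosure may be different.

```python
import time
from typing import Callable, Dict, List, Sequence

# A "basic operator" is modelled here as a plain function over a list of floats.
Operator = Callable[[Sequence[float]], List[float]]


def relu_loop(xs: Sequence[float]) -> List[float]:
    """Candidate 1: ReLU written as an explicit loop."""
    out = []
    for x in xs:
        out.append(x if x > 0 else 0.0)
    return out


def relu_comprehension(xs: Sequence[float]) -> List[float]:
    """Candidate 2: ReLU written as a list comprehension."""
    return [max(x, 0.0) for x in xs]


# Hypothetical registry: layer type -> candidate first basic operators.
CANDIDATES: Dict[str, List[Operator]] = {
    "relu": [relu_loop, relu_comprehension],
}


def measure_runtime(op: Operator, sample: Sequence[float], repeats: int = 5) -> float:
    """Run the operator several times and return the best wall-clock time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        op(sample)
        best = min(best, time.perf_counter() - start)
    return best


def select_operator(
    layer_type: str,
    sample: Sequence[float],
    cost: Callable[[Operator, Sequence[float]], float] = measure_runtime,
) -> Operator:
    """Pick the candidate with the lowest cost for the given layer type.

    The cost defaults to measured running time; a memory-based cost
    could be passed instead (assumption, not shown here).
    """
    return min(CANDIDATES[layer_type], key=lambda op: cost(op, sample))


if __name__ == "__main__":
    chosen = select_operator("relu", [-1.0, 0.5, 2.0] * 1000)
    print("selected operator:", chosen.__name__)
```

Which candidate is selected depends on the machine on which the measurement is taken, so the sketch makes no statement about the expected winner. Passing the cost as a parameter lets the same selection step use a different criterion, such as memory occupied, without changing the registry.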
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010052349.5A CN111222637B (en) | 2020-01-17 | 2020-01-17 | Neural network model deployment method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010052349.5A CN111222637B (en) | 2020-01-17 | 2020-01-17 | Neural network model deployment method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111222637A (en) | 2020-06-02 |
CN111222637B (en) | 2023-11-28 |
Family
ID=70829556
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010052349.5A Active CN111222637B (en) | 2020-01-17 | 2020-01-17 | Neural network model deployment method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111222637B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111814948A (en) * | 2020-06-18 | 2020-10-23 | 浙江大华技术股份有限公司 | Operation method and operation device of neural network and computer readable storage medium |
CN112101529A (en) * | 2020-10-14 | 2020-12-18 | 杭州海康威视数字技术股份有限公司 | Cross-platform deployment method and framework for neural network model inference |
CN112329467A (en) * | 2020-11-03 | 2021-02-05 | 腾讯科技(深圳)有限公司 | Address recognition method and device, electronic equipment and storage medium |
CN112947933A (en) * | 2021-02-24 | 2021-06-11 | 上海商汤智能科技有限公司 | Operator execution method and device, computer equipment and storage medium |
CN113190877A (en) * | 2021-04-29 | 2021-07-30 | 网易(杭州)网络有限公司 | Model loading method and device, readable storage medium and electronic equipment |
CN113780513A (en) * | 2020-06-10 | 2021-12-10 | 杭州海康威视数字技术股份有限公司 | Network model quantification and inference method and device, electronic equipment and storage medium |
WO2021248443A1 (en) * | 2020-06-12 | 2021-12-16 | Qualcomm Incorporated | Generate source code to build secure machine learning engine for edge devices and existing toolchains |
CN113849314A (en) * | 2021-09-30 | 2021-12-28 | 支付宝(杭州)信息技术有限公司 | Data processing model deployment method and device |
CN114330668A (en) * | 2021-12-31 | 2022-04-12 | 成都商汤科技有限公司 | Model processing method and device, electronic equipment and computer storage medium |
CN115099399A (en) * | 2022-06-27 | 2022-09-23 | 清华大学 | Neural network model deployment method and device, electronic equipment and storage medium |
WO2024046463A1 (en) * | 2022-09-02 | 2024-03-07 | 深圳忆海原识科技有限公司 | Model construction method, apparatus and platform, electronic device and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108122031A (en) * | 2017-12-20 | 2018-06-05 | 杭州国芯科技股份有限公司 | A kind of neutral net accelerator architecture of low-power consumption |
CN108229647A (en) * | 2017-08-18 | 2018-06-29 | 北京市商汤科技开发有限公司 | The generation method and device of neural network structure, electronic equipment, storage medium |
CN109409510A (en) * | 2018-09-14 | 2019-03-01 | 中国科学院深圳先进技术研究院 | Neuron circuit, chip, system and method, storage medium |
CN109919308A (en) * | 2017-12-13 | 2019-06-21 | 腾讯科技(深圳)有限公司 | A kind of neural network model dispositions method, prediction technique and relevant device |
US20190266504A1 (en) * | 2019-05-09 | 2019-08-29 | Intel Corporation | Using computational cost and instantaneous load analysis for intelligent deployment of neural networks on multiple hardware executors |
CN110659070A (en) * | 2018-06-29 | 2020-01-07 | 赛灵思公司 | High-parallelism computing system and instruction scheduling method thereof |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108229647A (en) * | 2017-08-18 | 2018-06-29 | 北京市商汤科技开发有限公司 | The generation method and device of neural network structure, electronic equipment, storage medium |
CN109919308A (en) * | 2017-12-13 | 2019-06-21 | 腾讯科技(深圳)有限公司 | A kind of neural network model dispositions method, prediction technique and relevant device |
CN108122031A (en) * | 2017-12-20 | 2018-06-05 | 杭州国芯科技股份有限公司 | A kind of neutral net accelerator architecture of low-power consumption |
CN110659070A (en) * | 2018-06-29 | 2020-01-07 | 赛灵思公司 | High-parallelism computing system and instruction scheduling method thereof |
CN109409510A (en) * | 2018-09-14 | 2019-03-01 | 中国科学院深圳先进技术研究院 | Neuron circuit, chip, system and method, storage medium |
US20190266504A1 (en) * | 2019-05-09 | 2019-08-29 | Intel Corporation | Using computational cost and instantaneous load analysis for intelligent deployment of neural networks on multiple hardware executors |
Non-Patent Citations (4)
Title |
---|
TENGTENG WANG AND JUNTAO FEI: "Adaptive Neural Control of Active Power Filter Using Fuzzy Sliding Mode Controller" * |
XIAOYU YU ET AL.: "A Data-Center FPGA Acceleration Platform for Convolutional Neural Network" * |
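王秉睿 (Wang Bingrui): "神经网络专用编程语言" [A Dedicated Programming Language for Neural Networks] * |
解谦 等 (Xie Qian et al.): "移动智能终端基于神经网络的人工智能技术与应用" [Neural-Network-Based Artificial Intelligence Technology and Applications for Mobile Smart Terminals] * |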
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113780513B (en) * | 2020-06-10 | 2024-05-03 | 杭州海康威视数字技术股份有限公司 | Network model quantization and reasoning method and device, electronic equipment and storage medium |
CN113780513A (en) * | 2020-06-10 | 2021-12-10 | 杭州海康威视数字技术股份有限公司 | Network model quantification and inference method and device, electronic equipment and storage medium |
WO2021248443A1 (en) * | 2020-06-12 | 2021-12-16 | Qualcomm Incorporated | Generate source code to build secure machine learning engine for edge devices and existing toolchains |
CN111814948A (en) * | 2020-06-18 | 2020-10-23 | 浙江大华技术股份有限公司 | Operation method and operation device of neural network and computer readable storage medium |
CN111814948B (en) * | 2020-06-18 | 2021-07-13 | 浙江大华技术股份有限公司 | Operation method and operation device of neural network and computer readable storage medium |
CN112101529A (en) * | 2020-10-14 | 2020-12-18 | 杭州海康威视数字技术股份有限公司 | Cross-platform deployment method and framework for neural network model inference |
CN112329467B (en) * | 2020-11-03 | 2022-09-30 | 腾讯科技(深圳)有限公司 | Address recognition method and device, electronic equipment and storage medium |
CN112329467A (en) * | 2020-11-03 | 2021-02-05 | 腾讯科技(深圳)有限公司 | Address recognition method and device, electronic equipment and storage medium |
CN112947933A (en) * | 2021-02-24 | 2021-06-11 | 上海商汤智能科技有限公司 | Operator execution method and device, computer equipment and storage medium |
CN113190877A (en) * | 2021-04-29 | 2021-07-30 | 网易(杭州)网络有限公司 | Model loading method and device, readable storage medium and electronic equipment |
CN113849314A (en) * | 2021-09-30 | 2021-12-28 | 支付宝(杭州)信息技术有限公司 | Data processing model deployment method and device |
CN114330668A (en) * | 2021-12-31 | 2022-04-12 | 成都商汤科技有限公司 | Model processing method and device, electronic equipment and computer storage medium |
CN115099399A (en) * | 2022-06-27 | 2022-09-23 | 清华大学 | Neural network model deployment method and device, electronic equipment and storage medium |
WO2024046463A1 (en) * | 2022-09-02 | 2024-03-07 | 深圳忆海原识科技有限公司 | Model construction method, apparatus and platform, electronic device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN111222637B (en) | 2023-11-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111222637B (en) | Neural network model deployment method and device, electronic equipment and storage medium | |
CN112947935B (en) | Operation method and device, electronic equipment and storage medium | |
CN113538517B (en) | Target tracking method and device, electronic equipment and storage medium | |
CN110889469A (en) | Image processing method and device, electronic equipment and storage medium | |
CN113065591B (en) | Target detection method and device, electronic equipment and storage medium | |
JP2021517282A (en) | Network modules, allocation methods and devices, electronic devices and storage media | |
CN111242303B (en) | Network training method and device, and image processing method and device | |
CN112328398B (en) | Task processing method and device, electronic equipment and storage medium | |
CN109934240B (en) | Feature updating method and device, electronic equipment and storage medium | |
CN110968362B (en) | Application running method, device and storage medium | |
CN109685041B (en) | Image analysis method and device, electronic equipment and storage medium | |
CN113807253A (en) | Face recognition method and device, electronic equipment and storage medium | |
CN110633715B (en) | Image processing method, network training method and device and electronic equipment | |
CN107463372B (en) | Data-driven page updating method and device | |
CN109447258B (en) | Neural network model optimization method and device, electronic device and storage medium | |
CN111488964A (en) | Image processing method and device and neural network training method and device | |
CN111694571B (en) | Compiling method and device | |
CN111488267B (en) | Interface test script generation method and device and electronic equipment | |
CN111582265A (en) | Text detection method and device, electronic equipment and storage medium | |
CN111046780A (en) | Neural network training and image recognition method, device, equipment and storage medium | |
CN107885464B (en) | Data storage method, device and computer readable storage medium | |
CN112734015B (en) | Network generation method and device, electronic equipment and storage medium | |
CN108549570B (en) | User interface updating method and device | |
CN113869295A (en) | Object detection method and device, electronic equipment and storage medium | |
CN113220280A (en) | Application generation method and device for generating application |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||