CN111222637B - Neural network model deployment method and device, electronic equipment and storage medium - Google Patents


Publication number
CN111222637B
CN111222637B (application CN202010052349.5A)
Authority
CN
China
Prior art keywords
neural network
network model
basic operator
operator
data
Prior art date
Legal status (assumed, not a legal conclusion)
Active
Application number
CN202010052349.5A
Other languages
Chinese (zh)
Other versions
CN111222637A (en)
Inventor
李南
Current Assignee
Shanghai Sensetime Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Sensetime Intelligent Technology Co Ltd
Priority to CN202010052349.5A
Publication of CN111222637A
Application granted
Publication of CN111222637B
Legal status: Active
Anticipated expiration

Classifications

    • G: Physics
    • G06: Computing; calculating or counting
    • G06N: Computing arrangements based on specific computational models
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063: Physical realisation using electronic means
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks

Abstract

The disclosure relates to a neural network model deployment method and apparatus, an electronic device, and a storage medium, wherein the method comprises: loading a neural network model; configuring initialization information for the neural network model, wherein the initialization information comprises resources allocated for the operation and data storage of the neural network model; and converting the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information. Embodiments of the disclosure can extend the capabilities of the application platform and improve the applicability of the neural network model.

Description

Neural network model deployment method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of artificial intelligence, and in particular relates to a neural network model deployment method and device, electronic equipment and a storage medium.
Background
Applying neural networks such as deep learning models in practice requires engineering work on many fronts. The deployment scheme determines how a model is used by programs and plays a key role in the overall application of neural network learning.
Currently, the environment in which a neural network is trained typically differs from the environment in which it is deployed, and the deployment environment demands higher runtime efficiency.
Disclosure of Invention
The disclosure provides a technical scheme for neural network model deployment.
According to an aspect of the present disclosure, there is provided a neural network model deployment method, including:
loading a neural network model;
configuring initialization information for the neural network model, wherein the initialization information comprises resources allocated for operation and data storage of the neural network model;
and converting the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information.
In some possible embodiments, the converting the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information includes:
determining a basic operator matched with a network layer in the neural network model, wherein the basic operator is represented by a preset programming language;
and converting the neural network model into the form of the preset programming language based on the network structure information of the neural network model, the initialization information and the basic operator.
In some possible implementations, the determining the basic operator matched with a network layer within the neural network model includes:
determining, based on a type of the network layer within the neural network model, at least one first basic operator corresponding to the type;
selecting a first basic operator meeting preset requirements from the at least one first basic operator, and determining the selected first basic operator as the basic operator matched with the network layer.
In some possible embodiments, the selecting a first basic operator that meets a preset requirement from the at least one first basic operator includes at least one of the following manners:
selecting a first basic operator which meets the preset requirement from the first basic operators according to the parameters of the network layer of the neural network model;
and operating the neural network model, and selecting a first basic operator conforming to the preset requirement based on the operation parameters of the at least one first basic operator.
In some possible embodiments, the preset requirements include: the running time is shortest and/or the occupied memory is smallest.
In some possible embodiments, the converting the neural network model into the form of the preset programming language based on the network structure information of the neural network model, the initialization information, and the basic operator includes:
operating the neural network model by utilizing resources included in the initialization information;
determining first class data and second class data in the process of operating the neural network model, wherein the first class data comprises the structures and initial weights of all network layers in the neural network model, and the second class data comprises the weights of all network layers after optimization;
configuring the first class data as global read-only data, and constructing a first structure representing the second class data, wherein the first structure comprises the input pointers and output pointers of the second class data corresponding to the basic operator;
compiling an inference function taking the first structure as a parameter, and writing a calling method of the basic operator into the inference function;
and generating an input interface and an output interface of the inference function.
In some possible embodiments, the method further comprises:
acquiring input data;
and processing the input data by using the neural network model in the preset programming language form to obtain a prediction result.
In some possible implementations, the neural network model includes at least one of a feature extraction neural network model, a classification neural network model, and a target detection neural network model.
According to a second aspect of the present disclosure, there is provided a neural network model deployment apparatus, comprising:
the loading module is used for loading the neural network model;
the configuration module is used for configuring initialization information for the neural network model, wherein the initialization information comprises resources distributed for operation and data storage of the neural network model;
the conversion module is used for converting the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information.
In some possible implementations, the conversion module is further configured to determine a basic operator matched with a network layer within the neural network model, the basic operator being expressed in a preset programming language; and
convert the neural network model into the form of the preset programming language based on the network structure information of the neural network model, the initialization information, and the basic operator.
In some possible implementations, the conversion module is further configured to determine, based on a type of the network layer within the neural network model, at least one first basic operator corresponding to the type; and
select a first basic operator meeting preset requirements from the at least one first basic operator, and determine the selected first basic operator as the basic operator matched with the network layer.
In some possible implementations, the conversion module is further configured to select a first basic operator that meets a preset requirement from the at least one first basic operator by at least one of the following manners:
selecting a first basic operator which meets the preset requirement from the first basic operators according to the parameters of the network layer of the neural network model;
and operating the neural network model, and selecting a first basic operator conforming to the preset requirement based on the operation parameters of the at least one first basic operator.
In some possible embodiments, the preset requirements include: the running time is shortest and/or the occupied memory is smallest.
In some possible implementations, the conversion module is further configured to run the neural network model using resources included in the initialization information;
determine first class data and second class data in the process of running the neural network model, wherein the first class data comprises the structures and initial weights of all network layers in the neural network model, and the second class data comprises the weights of all network layers after optimization;
configure the first class data as global read-only data, and construct a first structure representing the second class data, wherein the first structure comprises the input pointers and output pointers of the second class data corresponding to the basic operator;
compile an inference function taking the first structure as a parameter, and write a calling method of the basic operator into the inference function;
and generate an input interface and an output interface of the inference function.
In some possible embodiments, the apparatus further comprises:
the acquisition module is used for acquiring input data;
and the prediction module is used for processing the input data by utilizing the neural network model in the preset programming language form to obtain a prediction result.
In some possible implementations, the neural network model includes at least one of a feature extraction neural network model, a classification neural network model, and a target detection neural network model.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the method of any of the first aspects.
According to a fourth aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method of any one of the first aspects.
In the embodiment of the disclosure, in the process of deploying the neural network model to the application platform, the neural network model may be loaded first and, based on the initialization information configured for the neural network, converted into a preset programming language form; that is, a code program in the preset programming language is generated. With this configuration, a platform that cannot read neural network files can still apply the neural network model through the code generated for it, improving the extensibility and application performance of the platform. Meanwhile, the code conversion requires no extra time: the corresponding code can be generated while the initialization information is being configured, saving network deployment time.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the technical aspects of the disclosure.
FIG. 1 illustrates a flow chart of a neural network model deployment method, according to an embodiment of the present disclosure;
FIG. 2 illustrates a flowchart for converting a neural network model into a preset programming language form in accordance with an embodiment of the present disclosure;
FIG. 3 shows a flowchart of step S31 according to an embodiment of the present disclosure;
FIG. 4 shows a flowchart of step S32 in an embodiment according to the present disclosure;
FIG. 5 illustrates a block diagram of a neural network model deployment device, according to an embodiment of the present disclosure;
FIG. 6 illustrates a block diagram of an electronic device, according to an embodiment of the present disclosure;
FIG. 7 illustrates a block diagram of another electronic device, according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the disclosure will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association relationship between associated objects, meaning that three relationships are possible; e.g., A and/or B may represent: A exists alone, A and B exist together, or B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality; for example, including at least one of A, B, and C may mean including any one or more elements selected from the group consisting of A, B, and C.
Furthermore, numerous specific details are set forth in the following detailed description in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art have not been described in detail in order not to obscure the present disclosure.
The execution body of the neural network model deployment method provided by the embodiments of the present disclosure may be any electronic device; for example, the method may be executed by a terminal device or a server, where the terminal device may be User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like. The server may be a local server or a cloud server. In some possible implementations, the neural network model deployment method may be implemented by way of a processor invoking computer-readable instructions stored in a memory.
Fig. 1 shows a flowchart of a neural network model deployment method according to an embodiment of the present disclosure, as shown in fig. 1, the neural network model deployment method includes:
s10: loading a neural network model;
in some possible implementations, the embodiments of the present disclosure may implement deploying the neural network model in an application platform, where the application platform may be an application platform of a terminal device, or may be an application platform in a server, which is not specifically limited in this disclosure. The following describes embodiments of the present disclosure as being performed by an electronic device, but is not a specific limitation of the present disclosure.
In some possible embodiments, the neural network model may be a neural network model implementing any function, and may include, for example, at least one of a feature extraction neural network model implementing feature information extraction of input data, a target detection neural network model implementing object detection, a neural network model implementing target segmentation, a neural network model implementing target classification, and a neural network model of natural language processing. The foregoing is illustrative only and is not to be construed as limiting the present disclosure in any way.
In some possible embodiments, in the case of deploying a corresponding neural network model in the electronic device, the neural network model may be loaded first, that is, the files of the neural network model are loaded into a program of the application platform. The neural network model may include a first file describing the network structure of the neural network, and a second file storing parameter information such as the weights used in the neural network. For example, the neural network model in embodiments of the present disclosure may be in the caffemodel format and include two files: lenet.prototxt, describing the network structure, and lenet.caffemodel, storing the weights. The foregoing is merely exemplary and is not intended to be limiting in any way.
S20: configuring initialization information for the neural network model, wherein the initialization information comprises resources allocated for the operation and data storage of the neural network model;
in some possible embodiments, the initialization information may be configured for the neural network model during loading of the neural network model, or in the event that loading of the neural network model is complete. The configuration initialization information may include allocating corresponding resources, such as at least one of configuration memory and video memory, for operation and data storage of the neural network model. In the process of operating the neural network model, a plurality of data structures are required to support, and each data structure corresponds to a computing resource, such as a memory, a video memory and the like, and can be used for storing input and output data, the weight of a network, the parameters of the network and the like.
Embodiments of the disclosure may infer and execute the operation process of the neural network based on its network structure, calculate the resources required to execute the neural network model based on the data volume and occupied memory in the inference process, and perform the initialization configuration for the neural network model.
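The resource calculation described above can be sketched in C (the preset programming language used later in this document). Everything here, from the hypothetical `LayerShape` type to the ping-pong buffering assumption, is illustrative rather than taken from the patent:

```c
#include <stddef.h>

/* Hypothetical illustration: estimate the buffer size needed to run a
 * small network, so the initialization step can reserve it up front.
 * Layer shapes are invented for the example. */
typedef struct {
    size_t in_elems;   /* elements in the layer's input tensor  */
    size_t out_elems;  /* elements in the layer's output tensor */
} LayerShape;

size_t arena_bytes(const LayerShape *layers, size_t n) {
    size_t max_elems = 0;
    for (size_t i = 0; i < n; i++) {
        /* a simple ping-pong scheme only needs the largest in+out pair */
        size_t need = layers[i].in_elems + layers[i].out_elems;
        if (need > max_elems) max_elems = need;
    }
    return max_elems * sizeof(float);
}
```

A real deployment would also account for weights, workspace, and alignment; this only shows the idea of sizing resources from the network structure before allocating them.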
S30: based on the network structure information and the initialization information of the neural network model, converting the neural network model into a preset programming language form.
The embodiment of the disclosure can convert the neural network model into the form of a preset programming language during the configuration of the initialization information, or once that configuration is complete. For example, the preset programming language of embodiments of the present disclosure may include at least one of the C language and the C++ language. By converting the neural network model into a preset programming language, the neural network model can be made applicable to different application platforms, improving the applicability of the neural network. In addition, in the case where steps S20 and S30 are performed simultaneously, the programming language conversion incurs no extra time, saving time costs.
In the embodiment of the disclosure, converting the neural network model into a preset programming language form may include implementing functions of each network layer in the neural network model by using the preset programming language, generating calls for implementing functions corresponding to each network layer by using the preset programming language, thereby implementing functions of the neural network, configuring an input/output interface by using the preset programming language, and the like.
Through the configuration, the neural network model can be converted into the form of a preset programming language, and the neural network model can be suitable for various application platforms. In the case of converting the neural network model into a preset programming language form, in the application platform, the processing of the input data may be performed using the neural network model in the preset programming language form, wherein the input data may be determined according to a function of the neural network, and for example, the input data may include at least one of voice, text, image, video, and the like. And executing the processing of the input data through a neural network model in a preset programming language form to obtain a prediction result.
For example, the neural network model in the embodiment of the present disclosure may be the handwritten-digit recognition network MNIST. The model may first be loaded into a program, for example by loading lenet.prototxt and lenet.caffemodel, and converted into the preset programming language form. During application, input data in picture form may be passed into the program; after inference, the program returns a prediction result for the digit in the picture. The prediction result may include 10 floating point numbers, where the i-th number (indexing from 0, as in programming) represents the probability that the digit is i. If we pass in a picture of "0", the program returns (1,0,0,0,0,0,0,0,0,0); a picture of "3" returns (0,0,0,1,0,0,0,0,0,0); other digits behave the same, with a 1 at the corresponding position.
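As a hedged sketch of how a caller might interpret those 10 returned scores (the `predicted_digit` helper is invented for illustration, not part of the patent's generated code):

```c
/* Illustrative only: given the 10 scores returned by the MNIST inference
 * code, the predicted digit is the index of the largest score
 * (indices count from 0, matching the text above). */
int predicted_digit(const float scores[10]) {
    int best = 0;
    for (int i = 1; i < 10; i++)
        if (scores[i] > scores[best]) best = i;
    return best;
}
```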
The following describes in detail the specific procedures of the embodiments of the present disclosure with reference to the accompanying drawings. In the embodiment of the disclosure, under the condition that the file of the neural network model is loaded to the program, corresponding resources can be allocated for the operation of the neural network, and meanwhile, codes such as C codes related to the execution process of the neural network model are required to be generated, namely, the neural network model is converted into a preset programming language form.
Fig. 2 illustrates a flowchart for converting a neural network model into a preset programming language form in accordance with an embodiment of the present disclosure. The method for converting the neural network model into the preset programming language form based on the network structure information and the initialization information of the neural network model comprises the following steps:
s31: determining a basic operator matched with a network layer in the neural network model, wherein the basic operator is represented by a preset programming language;
s32: based on the network structure information, the initialization information and the basic operator of the neural network model, the neural network model is converted into a form of a preset programming language.
In some possible implementations, in generating a preset programming language for the neural network model, matched base operators may be selected for each network layer within the network structure in the neural network model.
The embodiment of the disclosure may configure basic operators with different functions in the application platform in advance, each basic operator being written in the preset programming language. For example, the basic operators may include a pooling operator Pooling_3x3_pad1x1_method0 for performing a pooling operation, a convolution operator Convolution_3x3_pad1x1_method0 for performing a convolution operation, and a mean-reduction operator Reduce_mean_method0, and may further include fully-connected operators, classification operators, and the like.
When the neural network model is loaded, a basic operator matched with each network layer can be obtained according to the type of that layer in the network structure, where each network layer type corresponds to a function. For example, the network layers may include a convolutional layer, a pooling layer, a fully-connected layer, and the like. In the embodiment of the disclosure, there may be at least one basic operator realizing a given function, and the code of different basic operators differs.
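A basic-operator library of this kind might be organized in C as a table mapping layer types to operator functions. The sketch below is an assumption about one possible layout; the operator bodies are stubs, and only the naming pattern follows the text:

```c
#include <stddef.h>
#include <string.h>

/* Assumed signature: every basic operator maps an input buffer to an
 * output buffer of n elements. */
typedef void (*BaseOp)(const float *in, float *out, size_t n);

static void Pooling_3x3_pad1x1_method0(const float *in, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) out[i] = in[i]; /* stub body */
}
static void Convolution_3x3_pad1x1_method0(const float *in, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) out[i] = in[i]; /* stub body */
}

typedef struct { const char *layer_type; BaseOp op; } OpEntry;

/* one table row per (layer type, operator implementation) pair */
static const OpEntry op_table[] = {
    { "pooling",     Pooling_3x3_pad1x1_method0 },
    { "convolution", Convolution_3x3_pad1x1_method0 },
};

BaseOp find_base_op(const char *layer_type) {
    for (size_t i = 0; i < sizeof op_table / sizeof op_table[0]; i++)
        if (strcmp(op_table[i].layer_type, layer_type) == 0)
            return op_table[i].op;
    return NULL; /* no operator registered for this layer type */
}
```

In the scheme the text describes, a layer type could map to several candidate operators; the next section covers how one is selected among them.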
Fig. 3 shows a flowchart of step S31 in an embodiment according to the present disclosure. Wherein determining a basis operator for network layer matching within the neural network model comprises:
s311: determining at least one first base operator corresponding to the type based on the type of the network layer within the neural network model;
s312: selecting a first basic operator meeting preset requirements from at least one first basic operator, and determining the selected first basic operator as a basic operator matched with a network layer.
In some possible embodiments, in the case of determining the type (function) of a network layer in the neural network, a basic operator with the same function as that network layer may be randomly selected from the basic operator library as the matched basic operator.
In some possible embodiments, in the case that there are multiple basic operators for implementing the functions of the network layer, one basic operator may be randomly selected as a matched basic operator, or a basic operator meeting the preset requirement may be selected as a matched basic operator. The disclosed embodiments refer to a base operator matching a type of a network layer as a first base operator.
The selecting the first basic operator meeting the preset requirement from the plurality of first basic operators matched by the network layer may include at least one of the following modes:
a) Selecting a first basic operator meeting preset requirements from the first basic operators according to parameters of a network layer of the neural network model;
in some possible embodiments, according to parameters used by the network layer, simulation operation may be directly performed through each first basic operator to obtain an occupation amount parameter of the operation resource, where the occupation amount parameter may include operation time and occupied memory. That is, according to the parameters used by the network layer, the embodiments of the present disclosure may directly perform a simulation operation through each first basic operator to obtain an occupation amount parameter of an operation resource, where the occupation amount parameter may include an operation time and an occupied memory. Further, the first basic operators meeting the preset requirements can be determined according to the occupation amount parameters of the operation resources of each first basic operator. The embodiment of the disclosure can determine the first basic operator with the shortest running time and/or the least occupied memory as the first basic operator meeting the preset requirement, thereby selecting the optimal basic operator matched with the network layer.
B) And operating the neural network model, and selecting a first basic operator meeting preset requirements based on the operation parameters of at least one first basic operator.
In some possible embodiments, the neural network may be run in its unconverted form, and the running resources occupied by the first basic operator corresponding to each network layer while processing actual data may be obtained, where the running resources likewise include running time and occupied memory; the first basic operator meeting the preset requirement may then be determined from the running resources of each first basic operator. The embodiment of the disclosure can determine the first basic operator with the shortest running time and/or the least occupied memory as the first basic operator meeting the preset requirement, thereby selecting the optimal basic operator matched with the network layer.
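Strategy B) above can be sketched in C as a timing helper plus a selection step. The `Candidate` signature and the use of `clock()` are illustrative assumptions, not the patent's actual mechanism; memory measurement is omitted:

```c
#include <stddef.h>
#include <time.h>

/* Assumed candidate-operator signature, invented for this sketch. */
typedef void (*Candidate)(const float *in, float *out, size_t n);

/* measure the running time of one candidate over several repetitions */
double time_operator(Candidate op, const float *in, float *out,
                     size_t n, int reps) {
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++) op(in, out, n);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

/* pick the candidate whose measured running time is shortest */
size_t select_min_time(const double *times, size_t n) {
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (times[i] < times[best]) best = i;
    return best;
}
```

Separating measurement from selection keeps the selection rule ("shortest running time") explicit; a combined time-and-memory criterion would replace `select_min_time` with a weighted comparison.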
By the above method, the basic operator matched with each network layer in the neural network model can be determined, realizing each network layer in the preset programming language.
Once the basic operators matched with each network layer are obtained, the framework of the main inference function of the neural network model can be generated, implementing the calls between the basic operators corresponding to each network layer.
Fig. 4 is a flowchart illustrating step S32 in an embodiment of the present disclosure, in which converting a neural network model into a form of a preset programming language based on network structure information, initialization information, and a basic operator of the neural network model, including:
s321: operating the neural network model by utilizing resources included in the initialization information;
in some possible embodiments, in the process of performing the conversion of the neural network of the preset programming voice, the loaded neural network model may be performed by using the configured initialization information, and the process may be performed separately or simultaneously with the determination process of the basic operator, which is not particularly limited in this disclosure.
S322: determining first class data and second class data in the process of operating the neural network model, wherein the first class data comprises structures and initial weights of all network layers in the neural network model, and the second class data comprises weights of all network layers after optimization;
In some possible embodiments, the first type of data and the second type of data used or generated by the neural network model during operation may be determined in advance. The first type of data is global data that does not change during operation, such as the structure of a network layer, the size of a convolution kernel, the size of a pooling kernel, and the initial weights of the network. The second type of data is data that may be modified during operation, such as the optimized network weights and the input/output space of each operator. These examples are merely illustrative and are not intended as specific limitations of the present disclosure.
S323: configuring the first type of data as global read-only data, and constructing a first structure body representing the second type of data, wherein the first structure body comprises an input pointer and an output pointer of the second type of data corresponding to a basic operator;
In some possible implementations, the various pieces of data described in S32 may be converted into a preset-programming-language representation. The first type of data, which does not change during the running process, can be written as global read-only data. For the second type of data, a first structure, such as a C structure (referred to herein as an inference structure), may be defined to represent the information that varies during the inference process; it may include the input and output pointers that propagate between the basic operators, as well as other necessary information.
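The two classes of data might look as follows once emitted as C code. This is a sketch under assumed names (a single toy convolution layer); the real generated code would be produced automatically from the network structure information.

```c
#include <stddef.h>

/* First class: global read-only data -- fixed network structure and the
 * initial weights, emitted as const objects so they cannot change at
 * run time.                                                            */
const int   g_conv1_kernel_size = 3;
const float g_conv1_init_weight[4] = {0.10f, -0.25f, 0.05f, 0.40f};

/* Second class: data that may be modified while the model runs.  The
 * first structure ("inference structure") gathers it, including the
 * input and output pointers that propagate between basic operators.    */
typedef struct {
    float *input;        /* input pointer fed to an operator        */
    float *output;       /* output pointer produced by an operator  */
    float *tuned_weight; /* weights after optimization              */
    size_t buf_len;      /* length of the input/output buffers      */
} inference_ctx_t;
```

Emitting the first class as `const` data lets the compiler place it in read-only storage, while everything mutable is reachable through one structure pointer.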
S324: compiling to form an inference function taking the first structural body as a parameter, and writing a calling method of a basic operator into the inference function;
in some possible embodiments, a main inference function of the neural network may also be generated, for example, an inference code is generated, and a parameter is compiled according to a syntax of a preset programming language to generate an inference function framework describing the first structure in step S33. The calling method of the basic operator matched with each network layer can be written into the reasoning function by using a preset programming language. The source of the function parameter of each basic operator may include the generated global read-only data and the first structure.
S325: an input interface and an output interface of the inference function are generated.
In some possible implementations, the purpose of running the neural network is to supply an input and then obtain an output. Accordingly, the neural network model needs externally visible input and output, so a set-input interface and a get-output interface are provided. The parameters required by the input interface and the output interface should come from the global read-only data and the first structure defined above.
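A minimal sketch of such an interface pair, wrapped around a stand-in inference pass (here identity plus bias). All names, buffer sizes, and the toy computation are assumptions introduced for illustration.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    float  in_buf[4];
    float  out_buf[4];
    size_t n;
} model_ctx_t;

/* Input interface: copies caller data into the structure's buffer. */
void model_set_input(model_ctx_t *ctx, const float *data, size_t n)
{
    ctx->n = n;
    memcpy(ctx->in_buf, data, n * sizeof(float));
}

/* Stand-in for the generated inference pass. */
void model_infer(model_ctx_t *ctx)
{
    for (size_t i = 0; i < ctx->n; ++i)
        ctx->out_buf[i] = ctx->in_buf[i] + 1.0f;
}

/* Output interface: exposes the result buffer to the caller. */
const float *model_get_output(const model_ctx_t *ctx, size_t *n)
{
    *n = ctx->n;
    return ctx->out_buf;
}
```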
In some possible embodiments, there may be scenarios in which multiple network layers or network models cooperate. The defined global read-only data and the first structure can generally be kept hidden, with creation and destruction functions whose names reflect the model they belong to. The same treatment (function names reflecting the model) is applied to the set-input interface and the get-output interface of the inference function, so that multiple models can be used within one program.
In the above manner, the functions of the network modules of the neural network can be associated with the corresponding preset-programming-language functions. In a specific application, a model name (such as a C string) may be bound to the generated inference function through a simple string-to-structure mapping table, where the structure contains the function pointers for creating, destroying, setting the input, and getting the output. Higher-level logic is thus isolated from the underlying C functions. That is, the embodiment of the present disclosure may bind the inference function generated for each neural network model to that model, forming a mapping table expressing the binding relationship; the mapping table may further include pointers to the functions that create and destroy the global read-only data and the first structure, as well as pointers to the functions that set the input and obtain the output (the pointers of the input interface and the output interface). In this way, each neural network model can be executed conveniently and without interfering with the others.
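The mapping table described above might be sketched as follows; the entry layout and lookup function are illustrative assumptions, with the function pointers left unset here for brevity.

```c
#include <string.h>

/* One entry binds a model-name string to the pointers that create and
 * destroy its context and that set the input / get the output, so
 * upper-level logic never touches the underlying C functions directly. */
typedef struct {
    const char *model_name;
    void *(*create)(void);
    void  (*destroy)(void *ctx);
    void  (*set_input)(void *ctx, const float *data);
    const float *(*get_output)(void *ctx);
} model_entry_t;

/* Look a model up by name; returns NULL when it is not registered. */
const model_entry_t *find_model(const model_entry_t *table, int n,
                                const char *name)
{
    for (int i = 0; i < n; ++i)
        if (strcmp(table[i].model_name, name) == 0)
            return &table[i];
    return (const model_entry_t *)0;
}
```

Upper-level code resolves a model by name once, then drives it exclusively through the entry's function pointers, so several generated models coexist without interference.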
In addition, in the embodiment of the present disclosure, if the platform environment of the actual application does not support code conversion, the conversion of the neural network model may be performed by a server or another device, and the converted code may then be loaded onto the application platform. The code conversion may be performed in an environment similar to the application platform, or offline.
In addition, if the platform of the actual application does not support dynamic memory allocation, pre-configured resource management information can be received before the neural network model is loaded or before the initialization information is configured, so as to realize the configuration of resources.
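On a platform without dynamic allocation, the pre-configured resources can take the form of a statically reserved pool. The bump allocator below is one minimal way this might look; the pool size and names are assumptions, not part of the disclosure.

```c
#include <stddef.h>

#define POOL_SIZE 1024  /* statically reserved resource pool (bytes) */

static unsigned char g_pool[POOL_SIZE];
static size_t g_pool_used = 0;

/* Hand out memory from the static pool instead of malloc; returns NULL
 * when the pre-configured resources are exhausted.                     */
void *pool_alloc(size_t bytes)
{
    if (g_pool_used + bytes > POOL_SIZE)
        return (void *)0;
    void *p = &g_pool[g_pool_used];
    g_pool_used += bytes;
    return p;
}
```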
In summary, in the embodiment of the disclosure, in the process of deploying the neural network model to an application platform, the neural network model may first be loaded and initialization information configured for it, and the model may then be converted into a preset programming language form, that is, a code program in the preset programming language is generated. With this configuration, a platform that does not support reading neural network files can still apply the neural network model through the code generated for it, improving the extensibility and application performance of the platform. Meanwhile, the code conversion requires no extra time, since the corresponding code can be generated while the initialization information is being configured, saving network deployment time and in particular reducing initialization time. Initialization time is especially important when the neural network of this embodiment is applied to vehicles or other scenarios with high time-response requirements (for example, where the device must be repeatedly started and shut down). In addition, the embodiment of the disclosure can also solve the problem of an application environment having no file IO.
It will be appreciated by those skilled in the art that, in the above methods of the specific embodiments, the written order of the steps does not imply a strict order of execution; the specific execution order of the steps should be determined by their functions and possible internal logic.
It will be appreciated that the above-mentioned method embodiments of the present disclosure may be combined with each other to form combined embodiments without departing from their principles and logic; for brevity, such combinations are not described in detail in the present disclosure.
In addition, the disclosure further provides a neural network model deployment apparatus, an electronic device, a computer-readable storage medium, and a program, all of which can be used to implement any of the neural network model deployment methods provided in the disclosure; for the corresponding technical solutions and descriptions, reference may be made to the corresponding descriptions of the method parts, which are not repeated here.
Fig. 5 illustrates a block diagram of a neural network model deployment apparatus, according to an embodiment of the present disclosure, as illustrated in fig. 5, the neural network model deployment apparatus may include:
a loading module 10 for loading the neural network model;
a configuration module 20, configured to configure initialization information for the neural network model, where the initialization information includes resources allocated for operation and data storage of the neural network model;
The conversion module 30 is configured to convert the neural network model into a preset programming language based on the network structure information and the initialization information of the neural network model.
In some possible implementations, the conversion module is further configured to determine a base operator of the network layer match within the neural network model, the base operator being represented by a preset programming language; and
based on the network structure information, the initialization information and the basic operator of the neural network model, the neural network model is converted into a form of a preset programming language.
In some possible implementations, the conversion module is further configured to determine, based on a type of network layer within the neural network model, at least one first base operator corresponding to the type; and
selecting a first basic operator meeting preset requirements from at least one first basic operator, and determining the selected first basic operator as a basic operator matched with a network layer.
In some possible embodiments, the conversion module is further configured to select a first basic operator that meets a preset requirement from at least one first basic operator by at least one of the following manners:
selecting a first basic operator meeting preset requirements from the first basic operators according to parameters of a network layer of the neural network model;
And operating the neural network model, and selecting a first basic operator meeting preset requirements based on the operation parameters of at least one first basic operator.
In some possible embodiments, the preset requirements include: the run time is the shortest and/or the occupied memory is the smallest.
In some possible embodiments, the conversion module is further configured to run the neural network model using the resources included in the initialization information;
determining first class data and second class data in the process of operating the neural network model, wherein the first class data comprises structures and initial weights of all network layers in the neural network model, and the second class data comprises weights of all network layers after optimization;
configuring the first type of data as global read-only data, and constructing a first structure body representing the second type of data, wherein the first structure body comprises an input pointer and an output pointer of the second type of data corresponding to a basic operator;
compiling to form an inference function taking the first structural body as a parameter, and writing a calling method of a basic operator into the inference function;
an input interface and an output interface of the inference function are generated.
In some possible embodiments, the apparatus further comprises:
the acquisition module is used for acquiring input data;
And the prediction module is used for processing the input data by utilizing a neural network model in a preset programming language form to obtain a prediction result.
In some possible implementations, the neural network model includes at least one of a feature extraction neural network model, a classification neural network model, and a target detection neural network model.
In some embodiments, functions or modules included in an apparatus provided by the embodiments of the present disclosure may be used to perform a method described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
The disclosed embodiments also provide a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method. The computer readable storage medium may be a non-volatile computer readable storage medium.
The embodiment of the disclosure also provides an electronic device, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the method described above.
The electronic device may be provided as a terminal, server or other form of device.
Fig. 6 shows a block diagram of an electronic device, according to an embodiment of the disclosure. For example, electronic device 800 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 6, an electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 806 provides power to the various components of the electronic device 800. The power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen providing an output interface between the electronic device 800 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the electronic device 800 is in an operational mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 814 includes one or more sensors for providing status assessment of various aspects of the electronic device 800. For example, the sensor assembly 814 may detect an on/off state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800. The sensor assembly 814 may also detect a change in position of the electronic device 800 or a component thereof, the presence or absence of a user's contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in its temperature. The sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communication between the electronic device 800 and other devices, either wired or wireless. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 804 including computer program instructions executable by processor 820 of electronic device 800 to perform the above-described methods.
Fig. 7 illustrates a block diagram of another electronic device, according to an embodiment of the present disclosure. For example, electronic device 1900 may be provided as a server. Referring to FIG. 7, electronic device 1900 includes a processing component 1922 that further includes one or more processors and memory resources represented by memory 1932 for storing instructions, such as application programs, that can be executed by processing component 1922. The application programs stored in memory 1932 may include one or more modules each corresponding to a set of instructions. Further, processing component 1922 is configured to execute instructions to perform the methods described above.
The electronic device 1900 may also include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 1932, including computer program instructions executable by processing component 1922 of electronic device 1900 to perform the methods described above.
The present disclosure may be a system, method, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure can be assembly instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, Field Programmable Gate Arrays (FPGAs), or Programmable Logic Arrays (PLAs), with state information of computer readable program instructions, which can execute the computer readable program instructions.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the technical improvement of the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (14)

1. A neural network model deployment method, comprising:
loading a neural network model;
configuring initialization information for the neural network model, wherein the initialization information comprises resources allocated for operation and data storage of the neural network model;
based on the network structure information of the neural network model and the initialization information, converting the neural network model into a preset programming language form;
the network structure information based on the neural network model and the initialization information convert the neural network model into a preset programming language form, and the method comprises the following steps:
Determining a basic operator matched with a network layer in the neural network model, wherein the basic operator is represented by a preset programming language;
converting the neural network model into a form of the preset programming language based on network structure information of the neural network model, the initialization information and the basic operator;
the converting the neural network model into the form of the preset programming language based on the network structure information, the initialization information and the basic operator of the neural network model comprises the following steps:
operating the neural network model by utilizing resources included in the initialization information;
determining first class data and second class data in the process of operating the neural network model, wherein the first class data comprises structures and initial weights of all network layers in the neural network model, and the second class data comprises the weights of all network layers after optimization;
configuring the first type of data into global read-only data, and constructing a first structure body representing the second type of data, wherein the first structure body comprises an input pointer and an output pointer of the second type of data corresponding to the basic operator;
Compiling an inference function taking the first structure body as a parameter, and writing a calling method of the basic operator into the inference function;
and generating an input interface and an output interface of the reasoning function.
2. The method of claim 1, wherein the determining the basis operator for network layer matching within the neural network model comprises:
determining at least one first base operator corresponding to a type of the network layer within the neural network model based on the type;
selecting a first basic operator meeting preset requirements from the at least one first basic operator, and determining the selected first basic operator as the basic operator matched with the network layer.
3. The method according to claim 2, wherein the selecting a first basic operator from the at least one first basic operator, which meets a preset requirement, comprises at least one of the following ways:
selecting a first basic operator which meets the preset requirement from the first basic operators according to the parameters of the network layer of the neural network model;
and operating the neural network model, and selecting a first basic operator conforming to the preset requirement based on the operation parameters of the at least one first basic operator.
4. A method according to claim 2 or 3, wherein the preset requirements include: the run time is the shortest and/or the occupied memory is the smallest.
5. The method according to claim 1, wherein the method further comprises:
acquiring input data;
and processing the input data by using the neural network model in the preset programming language form to obtain a prediction result.
6. The method of claim 1, wherein the neural network model comprises at least one of a feature extraction neural network model, a classification neural network model, and a target detection neural network model.
7. A neural network model deployment apparatus, comprising:
the loading module is used for loading the neural network model;
the configuration module is used for configuring initialization information for the neural network model, wherein the initialization information comprises resources allocated for the running and data storage of the neural network model;
the conversion module is used for converting the neural network model into a preset programming language form based on the network structure information of the neural network model and the initialization information;
the conversion module is further used for determining a basic operator matched with a network layer in the neural network model, wherein the basic operator is represented in a preset programming language; and
converting the neural network model into the form of the preset programming language based on the network structure information of the neural network model, the initialization information and the basic operator;
the conversion module is further used for running the neural network model by using the resources included in the initialization information;
determining first class data and second class data in the process of running the neural network model, wherein the first class data comprises the structures and initial weights of the network layers in the neural network model, and the second class data comprises the weights of the network layers after optimization;
configuring the first class data as global read-only data, and constructing a first structure body representing the second class data, wherein the first structure body comprises an input pointer and an output pointer of the second class data corresponding to the basic operator;
compiling an inference function taking the first structure body as a parameter, and writing a calling method of the basic operator into the inference function;
and generating an input interface and an output interface of the inference function.
8. The apparatus of claim 7, wherein the conversion module is further configured to determine, based on a type of the network layer in the neural network model, at least one first basic operator corresponding to the type; and
select, from the at least one first basic operator, a first basic operator meeting a preset requirement, and determine the selected first basic operator as the basic operator matched with the network layer.
9. The apparatus of claim 8, wherein the conversion module is further configured to select, from the at least one first basic operator, a first basic operator meeting the preset requirement by at least one of the following ways:
selecting, according to the parameters of the network layer of the neural network model, a first basic operator meeting the preset requirement from the at least one first basic operator;
and running the neural network model, and selecting a first basic operator meeting the preset requirement based on the running parameters of the at least one first basic operator.
10. The apparatus according to claim 8 or 9, wherein the preset requirement comprises: shortest run time and/or smallest occupied memory.
11. The apparatus of claim 7, wherein the apparatus further comprises:
the acquisition module is used for acquiring input data;
and the prediction module is used for processing the input data by utilizing the neural network model in the preset programming language form to obtain a prediction result.
12. The apparatus of claim 7, wherein the neural network model comprises at least one of a feature extraction neural network model, a classification neural network model, and a target detection neural network model.
13. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the method of any of claims 1-6.
14. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method of any of claims 1-6.
CN202010052349.5A 2020-01-17 2020-01-17 Neural network model deployment method and device, electronic equipment and storage medium Active CN111222637B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010052349.5A CN111222637B (en) 2020-01-17 2020-01-17 Neural network model deployment method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111222637A CN111222637A (en) 2020-06-02
CN111222637B true CN111222637B (en) 2023-11-28

Family

ID=70829556

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010052349.5A Active CN111222637B (en) 2020-01-17 2020-01-17 Neural network model deployment method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111222637B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230169352A1 (en) * 2020-06-12 2023-06-01 Haijun Zhao Generate source code to build secure machine learning engine for edge devices and existing toolchains
CN111814948B (en) * 2020-06-18 2021-07-13 浙江大华技术股份有限公司 Operation method and operation device of neural network and computer readable storage medium
CN112329467B (en) * 2020-11-03 2022-09-30 腾讯科技(深圳)有限公司 Address recognition method and device, electronic equipment and storage medium
CN112947933A (en) * 2021-02-24 2021-06-11 上海商汤智能科技有限公司 Operator execution method and device, computer equipment and storage medium
CN113190877B (en) * 2021-04-29 2022-10-04 网易(杭州)网络有限公司 Model loading method and device, readable storage medium and electronic equipment
CN113849314A (en) * 2021-09-30 2021-12-28 支付宝(杭州)信息技术有限公司 Data processing model deployment method and device
CN117709402A (en) * 2022-09-02 2024-03-15 深圳忆海原识科技有限公司 Model construction method, device, platform, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108122031A (en) * 2017-12-20 2018-06-05 杭州国芯科技股份有限公司 A kind of neutral net accelerator architecture of low-power consumption
CN108229647A (en) * 2017-08-18 2018-06-29 北京市商汤科技开发有限公司 The generation method and device of neural network structure, electronic equipment, storage medium
CN109409510A (en) * 2018-09-14 2019-03-01 中国科学院深圳先进技术研究院 Neuron circuit, chip, system and method, storage medium
CN109919308A (en) * 2017-12-13 2019-06-21 腾讯科技(深圳)有限公司 A kind of neural network model dispositions method, prediction technique and relevant device
CN110659070A (en) * 2018-06-29 2020-01-07 赛灵思公司 High-parallelism computing system and instruction scheduling method thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11790250B2 (en) * 2019-05-09 2023-10-17 Intel Corporation Using computational cost and instantaneous load analysis for intelligent deployment of neural networks on multiple hardware executors

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Tengteng Wang and Juntao Fei. Adaptive Neural Control of Active Power Filter Using Fuzzy Sliding Mode Controller. IEEE Access. 2016, pp. 6816-6822. *
Xiaoyu Yu et al. A Data-Center FPGA Acceleration Platform for Convolutional Neural Network. 2019 29th International Conference on Field Programmable Logic and Applications. 2019, pp. 151-158. *
Wang Bingrui. A Dedicated Programming Language for Neural Networks. China Master's Theses Full-text Database, Information Science and Technology. 2019, I138-414. *
Xie Qian et al. Neural-network-based Artificial Intelligence Technology and Applications on Mobile Intelligent Terminals. Information and Communications Technology and Policy. 2019, No. 12, Section 5.3. *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant