CN111596972B - Neural network model storage method, loading method, device, equipment and storage medium - Google Patents

Neural network model storage method, loading method, device, equipment and storage medium

Info

Publication number
CN111596972B
Authority
CN
China
Prior art keywords
memory
memory block
object memory
sub
programming language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010415219.3A
Other languages
Chinese (zh)
Other versions
CN111596972A (en)
Inventor
卢旭辉
何亮亮
刘托
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Pinecone Electronic Co Ltd
Original Assignee
Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Pinecone Electronic Co Ltd
Priority to CN202010415219.3A
Publication of CN111596972A
Application granted
Publication of CN111596972B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44: Arrangements for executing specific programs
    • G06F 9/445: Program loading or initiating
    • G06F 9/44557: Code layout in executable memory
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/10: Interfaces, programming languages or software development kits, e.g. for simulating neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a neural network model storage method, a loading method, an apparatus, a device, and a storage medium. The storage method comprises the following steps: acquiring a source model file of a neural network model; parsing the data in the source model file to obtain the content data of the model; splicing the content data of the model into an object memory block of a first programming language; packing the object memory block into a byte stream memory block; and storing the byte stream memory block to obtain the target model file. The neural network model storage method requires no third-party library to be integrated and can therefore reduce the memory requirement.

Description

Neural network model storage method, loading method, device, equipment and storage medium
Technical Field
The present application relates generally to the field of computer technology, and in particular to a neural network model storage method, a neural network model loading method, corresponding apparatuses, a device, and a storage medium.
Background
Driven by major advances in big data and computing power, deep learning has propelled the progress of artificial intelligence. As artificial intelligence scenarios and applications grow richer, intelligent terminal devices have become an entry point for artificial intelligence.
In existing deep learning frameworks, a third-party library must be integrated to store a neural network model: a Protobuf-based scheme must integrate the Protobuf library, and a FlatBuffers-based scheme must integrate the FlatBuffers library.
Because existing approaches store the neural network model through an integrated third-party library, the memory requirement is increased.
Disclosure of Invention
In view of the foregoing drawbacks or shortcomings in the prior art, it is desirable to provide a neural network model storage method, loading method, apparatus, device, and storage medium.
In a first aspect, the present application provides a neural network model storage method, including:
acquiring a source model file of a neural network model;
parsing the data in the source model file to obtain the content data of the model;
splicing the content data of the model into an object memory block of a first programming language;
packing the object memory block into a byte stream memory block;
and storing the byte stream memory block to obtain the target model file.
In one embodiment, splicing the content data of the model into an object memory block of a first programming language includes:
sequentially filling the content data of the model into the memory structure, and splicing the content data into an object memory block of the first programming language;
the memory structure is obtained by performing layout according to the layout rule of the objects of the first programming language in the memory.
In one embodiment, the method further comprises:
acquiring a memory byte order of a target operating system;
the step of packing the object memory block into a byte stream memory block includes:
and packing the object memory block into a byte stream memory block according to the memory byte order of the target operating system.
In one embodiment, the first programming language includes a C language, a C++ language, and an assembly language.
In one embodiment, the splicing of the content data of the model into the object memory block of the first programming language is implemented in a second programming language.
In one embodiment, the object memory blocks of the first programming language include at least one sub-object memory block and pointers corresponding to each sub-object memory block, the pointers corresponding to the sub-object memory blocks pointing to memory addresses of the sub-object memory blocks.
In one embodiment, the object memory blocks of the first programming language include at least two sub-object memory blocks and pointers corresponding to each sub-object memory block, and the order of the sub-object memory blocks is consistent with the order of the pointers corresponding to the sub-object memory blocks.
In a second aspect, the present application provides a neural network model loading method, including:
obtaining a target model file obtained by the storage method;
loading the target model file into a memory in a memory mapping mode to obtain a byte stream memory block;
and converting the byte stream memory block into an object memory block in a type conversion mode.
In a third aspect, the present application provides a neural network model storage device, including:
the first acquisition module is used for acquiring a source model file of the neural network model;
the parsing module is used for parsing the data in the source model file to obtain the content data of the model;
the splicing module is used for splicing the content data of the model into an object memory block of the first programming language;
the packing module is used for packing the object memory block into a byte stream memory block;
and the storage module is used for storing the byte stream memory blocks to obtain the target model file.
In one embodiment, the splicing module is further configured to:
sequentially filling the content data of the model into the memory structure, and splicing the content data into an object memory block of the first programming language;
the memory structure is obtained by performing layout according to the layout rule of the objects of the first programming language in the memory.
In one embodiment, the apparatus further comprises: the second acquisition module is used for acquiring the memory byte order of the target operating system;
the packing module is also for: packing the object memory block into a byte stream memory block according to the memory byte order of the target operating system.
In one embodiment, the first programming language includes a C language, a C++ language, and an assembly language.
In one embodiment, the splicing of the content data of the model into the object memory block of the first programming language is implemented in a second programming language.
In one embodiment, the object memory blocks of the first programming language include at least one sub-object memory block and pointers corresponding to each sub-object memory block, the pointers corresponding to the sub-object memory blocks pointing to memory addresses of the sub-object memory blocks.
In one embodiment, the object memory blocks of the first programming language include at least two sub-object memory blocks and pointers corresponding to each sub-object memory block, and the order of the sub-object memory blocks is consistent with the order of the pointers corresponding to the sub-object memory blocks.
In a fourth aspect, the present application provides a neural network model loading device, including:
the third acquisition module is used for acquiring the target model file stored in the storage module in any storage device;
the loading module is used for loading the target model file into the memory in a memory mapping mode to obtain a byte stream memory block;
and the conversion module is used for converting the byte stream memory block into the object memory block in a type conversion mode.
In a fifth aspect, the present application provides an apparatus, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements any one of the storage methods described above or the loading method described above.
In a sixth aspect, the present application provides a readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the storage methods described above and implements the loading method described above.
With the neural network model storage method, loading method, apparatus, device, and storage medium of the present application, the content data obtained by parsing the source model file of the neural network model is spliced into an object memory block of a first programming language; the object memory block is then packed into a byte stream memory block, which is stored to obtain the target model file. No third-party library needs to be integrated, so the memory requirement can be reduced.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
fig. 1 is a schematic flow chart of a neural network model storage method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the memory layout of a C++ object named NetDef;
FIG. 3 is a schematic flow chart of a neural network model loading method according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a neural network model storage device according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a neural network model loading device according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be noted that, for convenience of description, only the portions related to the application are shown in the drawings.
The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the application, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, the techniques, methods, and apparatus should be considered part of the specification.
In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of exemplary embodiments may have different values.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments. It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present application unless it is specifically stated otherwise.
In this embodiment, a neural network model storage method and a loading method are provided. It should be understood that the storage method and the loading method are applicable to the storage and loading of any neural network model.
The neural network model storage method may be an end-side neural network model storage method, where the end side may be a terminal or an IoT (Internet of Things) device. It is understood that the terminal may be a mobile terminal such as a smart phone, a tablet computer, or a microcontroller.
It will be appreciated that the neural network model storage method is performed by a server, a developer PC, or the like.
Referring to fig. 1, a flow diagram of a neural network model storage method is shown, according to one embodiment of the application.
As shown in fig. 1, a neural network model storage method may include:
s110, acquiring a source model file of the neural network model.
Specifically, the source model file of the neural network model refers to a model file produced by any of various neural network frameworks. Model files of neural network models may fall into categories such as image recognition, speech recognition, and text classification; that is, a model file in this embodiment can correspond to a neural network algorithm of any of these categories.
The model file of the neural network model can be read by a model conversion tool. For example, the model conversion tool may read the file through a script written in the Python language that calls a system API (Application Program Interface).
S120, parsing the data in the source model file to obtain the content data of the model.
Specifically, parsing the data in the source model file to obtain the content data of the model can be implemented with existing techniques and is not described further here.
S130, splicing the content data of the model into an object memory block of the first programming language.
In particular, a programming language is a formal language used to define computer programs and issue instructions to a computer. Programming languages generally include assembly languages and high-level languages. An assembly language is a specific language whose instructions use abbreviated English mnemonics that are easy to recognize and remember; an executable file generated from an assembly-language source program is small and executes quickly. A high-level language is not one specific language but a collection of many programming languages, such as VB, C, and C++. By way of example, the first programming language in this embodiment may include the C language, the C++ language, assembly language, and the like.
A memory block is a contiguous section of memory data; it may hold a byte stream, an object of a programming language, or other data.
Optionally, the object memory block of the first programming language may include at least one sub-object memory block and a pointer corresponding to each sub-object memory block, where the pointer corresponding to the sub-object memory block points to a storage address of the sub-object memory block. Further, the object memory blocks of the first programming language include at least two sub-object memory blocks and pointers corresponding to each sub-object memory block, and the sequence of the sub-object memory blocks is consistent with the sequence of the pointers corresponding to the sub-object memory blocks.
Optionally, the step S130 of stitching the content data of the model into the object memory block of the first programming language includes:
sequentially filling the content data of the model into the memory structure, and splicing the content data into an object memory block of the first programming language;
the memory structure is obtained by performing layout according to the layout rule of the objects of the first programming language in the memory.
Specifically, objects of each programming language follow a layout rule in memory, and the memory-structure layout chosen by each designer may differ. In the embodiment of the present application, a C++ object is taken as the example: FIG. 2 shows the memory layout of a C++ object named NetDef, which contains sub-objects such as OperatorDef and Argument. The pointer for each sub-object points to the memory address of the corresponding sub-object.
Objects of different first programming languages have different layout rules in memory, and the memory structure is laid out according to the layout rule of objects of the chosen first programming language. The content data of the model obtained by parsing in step S120 is filled into the memory structure in sequence, thereby splicing together the object memory block of the first programming language. For example, according to the layout rule of C++ objects in memory, the parsed content data of the model is constructed into a C++ object memory block that contains all of the model's content data, with each sub-object in the block holding a different part of that data. If the content data of the model is divided differently, the content data corresponding to each sub-object differs accordingly.
For example, continuing with the memory layout of the NetDef object shown in FIG. 2, the NetDef object contains the content data of the model, each sub-object holding different content data; the content data of the model is filled into the different sub-objects according to the layout rule of the NetDef object, and the NetDef object memory block is finally spliced together, as sketched below.
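A minimal C++ sketch of this splicing step follows. The NetDef, OperatorDef, and Argument fields below are illustrative assumptions, since the patent does not reproduce the FIG. 2 definitions; the helper name SpliceObjectBlock is ours.

    #include <cstdint>
    #include <vector>

    // Hypothetical sub-object and object definitions; the real FIG. 2 NetDef
    // layout is not reproduced in the text, so these fields are assumptions.
    struct Argument    { int32_t value; };
    struct OperatorDef { int32_t type; const Argument* arg; };
    struct NetDef {                    // object of the first programming language
        const OperatorDef* op;         // pointer order matches sub-object order
        const Argument*    arg;
    };

    // S130: fill the parsed content data into the memory structure in sequence,
    // splicing one contiguous object memory block: the NetDef header first, then
    // the sub-objects, with each header pointer set to its sub-object's address.
    std::vector<uint8_t> SpliceObjectBlock(int32_t op_type, int32_t arg_value) {
        std::vector<uint8_t> block(sizeof(NetDef) + sizeof(OperatorDef) + sizeof(Argument));
        auto* net = reinterpret_cast<NetDef*>(block.data());
        auto* op  = reinterpret_cast<OperatorDef*>(block.data() + sizeof(NetDef));
        auto* arg = reinterpret_cast<Argument*>(block.data() + sizeof(NetDef) + sizeof(OperatorDef));
        arg->value = arg_value;        // content data filled in sequence
        op->type   = op_type;
        op->arg    = arg;              // sub-object pointer -> sub-object address
        net->op    = op;
        net->arg   = arg;
        return block;                  // the spliced object memory block
    }

Note that raw pointers do not survive a round trip through a file; a practical variant of this sketch would store offsets relative to the block base (a form of relative pointer), which preserves the property that the loaded bytes can be used without parsing.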
Alternatively, the splicing of the content data of the model into the object memory block of the first programming language in S130 may be implemented in a second programming language.
Specifically, the second programming language may be the same programming language as the first, or a different one. The second programming language may be, for example, the Python scripting language, or the C language, the C++ language, or the like.
S140, packaging the object memory block into a byte stream memory block.
Specifically, the byte stream memory block is memory data with a data type of byte array.
Optionally, the neural network model storage method of the embodiment may further include:
and acquiring the memory byte order of the target operating system.
Specifically, the target operating system may be an end-side operating system; by way of example, it may be the operating system of a mobile terminal, such as a mobile phone. Common mobile phone operating systems currently include Android and iOS.
Memory byte order refers to the order in which data longer than one byte is stored in memory. There are generally two byte orders: little-endian and big-endian. Little-endian means the low-order bytes are stored at the low memory addresses and the high-order bytes at the high addresses; big-endian means the high-order bytes are stored at the low addresses and the low-order bytes at the high addresses. Typically, the memory byte order of Android and iOS is little-endian, while the operating systems of some embedded platforms, for example, use big-endian. The host byte order can be probed at run time, as sketched below.
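For illustration, a minimal C++ probe of the host byte order; the function name is ours, not the patent's.

    #include <cstdint>

    // Probe the host memory byte order at run time: on a little-endian host the
    // low-order byte 0x02 of the probe value sits at the lower address.
    bool IsLittleEndianHost() {
        const uint16_t probe = 0x0102;
        return *reinterpret_cast<const uint8_t*>(&probe) == 0x02;
    }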
The step of packing the object memory block into the byte stream memory block S140 may include:
and packing the object memory blocks into byte stream memory blocks according to the memory byte sequence of the target operating system.
Specifically, the byte stream memory block is memory data whose data type is a byte array. The object memory block spliced in step S130 is packed into the byte stream memory block according to the memory byte order of the target operating system, so that the byte order of the byte stream memory block is consistent with the memory byte order of the target operating system.
It should be noted that the data in both the byte stream memory block and the object memory block is the content data of the model. A packing sketch follows.
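A hedged C++ sketch of this packing step, under stated assumptions: the single 32-bit fix-up at offset 0 is purely illustrative, since a real packer would walk every multi-byte field of the object layout, and PackByteStream is our name.

    #include <cstdint>
    #include <cstring>
    #include <vector>

    // Reverse the bytes of one 32-bit value (used when host and target differ).
    uint32_t Swap32(uint32_t v) {
        return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
               ((v << 8) & 0x00FF0000u) | (v << 24);
    }

    // S140: pack the object memory block into a byte stream memory block whose
    // byte order matches the target operating system.
    std::vector<uint8_t> PackByteStream(const std::vector<uint8_t>& object_block,
                                        bool host_little, bool target_little) {
        std::vector<uint8_t> stream = object_block;       // byte-array memory data
        if (host_little != target_little && stream.size() >= 4) {
            uint32_t field;                               // illustrative: one 32-bit
            std::memcpy(&field, stream.data(), 4);        // field at offset 0
            field = Swap32(field);
            std::memcpy(stream.data(), &field, 4);
        }
        return stream;           // same content data, now in the target byte order
    }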
And S150, storing the byte stream memory block to obtain the target model file.
Specifically, the byte stream memory block may be stored as the target model file on a disk of the server or on the developer's PC (personal computer); the storage location of the byte stream memory block is not limited.
The target model file is a new model file of the neural network model, and obtaining it requires no additional library to be integrated, so the memory requirement can be reduced.
After the new model file of the neural network model has been stored by the above method, a developer packages the stored target model file together with the on-device deep learning framework into an APP (application program). The APP can then be downloaded by users from a terminal application store or preinstalled on the terminal for users. An APP offered for download in an application store may be stored on the server corresponding to that store, which may or may not be the same server (or developer PC) that stores the target model file; this is not limited.
With the neural network model storage method of this embodiment, the content data obtained by parsing the model file of the neural network model is spliced into an object memory block of the first programming language, and the object memory block is then packed into a byte stream memory block and stored to obtain the target model file. No third-party library needs to be integrated, so the memory requirement can be reduced.
When the neural network model stored in the above embodiment is loaded, the loading may be performed by the following embodiment. It is understood that the neural network model loading method is performed by a terminal or the like.
Referring to fig. 3, a flow diagram of a neural network model loading method is shown, according to one embodiment of the application.
As shown in fig. 3, a neural network model loading method may include:
s310, obtaining the target model file obtained in any embodiment;
s320, loading the target model file into a memory in a memory mapping mode to obtain a byte stream memory block;
s330, converting the byte stream memory block into an object memory block in a type conversion mode.
Specifically, the target model file is obtained through the neural network model storage method in the above embodiment. The new model file may be stored in a server or in a developer's PC.
When the neural network model is to be loaded, the target model file can be obtained through the terminal's application store or through an APP preinstalled on the terminal.
Memory mapping is a way of accessing a file through memory: a section of the file is mapped onto a block of memory, so that reading the memory reads the file. The obtained target model file is loaded into memory by memory mapping to obtain the byte stream memory block.
Type conversion converts data from one type to another; in this embodiment it converts the byte stream into an object the program can use, and may be performed with the C++ static_cast conversion operator, for example, as sketched below.
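A minimal POSIX C++ sketch of S320 and S330 under stated assumptions: NetDef stands for whatever object type the storage side produced (an assumed name, forward-declared here), and error handling is reduced to returning nullptr.

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    struct NetDef;  // object type written by the storage side (assumed name)

    const NetDef* LoadModel(const char* path) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return nullptr;
        struct stat st;
        if (fstat(fd, &st) != 0) { close(fd); return nullptr; }
        // S320: memory-map the target model file; pages are read on demand,
        // so no data is parsed or copied while loading.
        void* addr = mmap(nullptr, static_cast<size_t>(st.st_size),
                          PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);  // the mapping remains valid after the descriptor is closed
        if (addr == MAP_FAILED) return nullptr;
        // S330: type-convert the byte stream memory block into the object
        // memory block; static_cast from void* requires no further parsing.
        return static_cast<const NetDef*>(addr);
    }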
In this neural network model loading method, the format of the target model file is identical to the in-memory object format of the neural network model at run time, so the file can be loaded into memory without parsing any data. Initializing the neural network model therefore needs no extra memory, computing resources are saved, and the neural network model loads faster.
Fig. 4 is a schematic structural diagram of a neural network model storage device 400 according to an embodiment of the present application. As shown in fig. 4, the apparatus may implement the method shown in fig. 1, and the apparatus may include:
a first obtaining module 410, configured to obtain a source model file of the neural network model;
the parsing module 420 is configured to parse the data in the source model file to obtain content data of the model;
a splicing module 430, configured to splice the content data of the model into an object memory block of the first programming language;
a packing module 440, configured to pack the object memory block into a byte stream memory block;
and the storage module 450 is used for storing the byte stream memory blocks to obtain the target model file.
Optionally, the splicing module 430 is further configured to:
sequentially filling the content data of the model into the memory structure, and splicing the content data into an object memory block of the first programming language;
the memory structure is obtained by performing layout according to the layout rule of the objects of the first programming language in the memory.
Optionally, the apparatus further comprises: the second acquisition module is used for acquiring the memory byte order of the target operating system;
the packing module 440 is also configured to: pack the object memory block into a byte stream memory block according to the memory byte order of the target operating system.
The first programming language includes a C language, a C++ language, and an assembly language.
Optionally, the splicing of the content data of the model into the object memory block of the first programming language is implemented in a second programming language.
Optionally, the object memory block of the first programming language includes at least one sub-object memory block and a pointer corresponding to each sub-object memory block, where the pointer corresponding to the sub-object memory block points to a storage address of the sub-object memory block.
Optionally, the object memory blocks of the first programming language include at least two sub-object memory blocks and pointers corresponding to each sub-object memory block, and an order of the sub-object memory blocks is consistent with an order of the pointers corresponding to the sub-object memory blocks.
The neural network model storage device provided in this embodiment may execute the embodiment of the method, and its implementation principle and technical effects are similar, and will not be described herein.
Fig. 5 is a schematic structural diagram of a neural network model loading device 500 according to an embodiment of the present application. As shown in fig. 5, the apparatus may implement the method shown in fig. 3, and the apparatus may include:
a third obtaining module 510, configured to obtain the target model file stored by the storage module 450 in any one of the above storage devices;
the loading module 520 is configured to load the target model file into the memory in a memory mapping manner, so as to obtain a byte stream memory block;
the conversion module 530 is configured to convert the byte stream memory block into the object memory block by a type conversion method.
The neural network model loading device provided in this embodiment may execute the embodiment of the method, and its implementation principle and technical effects are similar, and will not be described herein.
Fig. 6 is a schematic structural diagram of an apparatus according to an embodiment of the present application. As shown in fig. 6, a schematic diagram of a computer system 600 suitable for use in implementing a terminal device or server of an embodiment of the present application is shown.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, etc.; an output portion 607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, as well as a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to fig. 1 may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the neural network model storage method and loading method described above. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules involved in the embodiments of the present application may be implemented in software or in hardware. The described units or modules may also be provided in a processor. The names of these units or modules do not, in certain cases, constitute a limitation of the units or modules themselves.
As another aspect, the present application also provides a computer-readable storage medium, which may be the computer-readable storage medium contained in the apparatus of the foregoing embodiment, or a stand-alone computer-readable storage medium not assembled into any device. The computer-readable storage medium stores one or more programs used by one or more processors to perform the storage method or loading method of the neural network model described in the present application.
The above description is only illustrative of the preferred embodiments of the present application and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the application is not limited to the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, solutions in which the above features are replaced with (but not limited to) technical features having similar functions disclosed in the present application.

Claims (16)

1. A neural network model storage method, comprising:
acquiring a source model file of a neural network model;
parsing the data in the source model file to obtain the content data of the model;
splicing the content data of the model into an object memory block of a first programming language, comprising:
sequentially filling the content data of the model into a memory structure, and splicing the content data into an object memory block of the first programming language;
the memory structure is obtained by carrying out layout according to the layout rule of the object of the first programming language in the memory;
packing the object memory block into a byte stream memory block;
and storing the byte stream memory block to obtain a target model file.
2. The storage method of claim 1, wherein the method further comprises:
acquiring a memory byte order of a target operating system;
the step of packing the object memory block into a byte stream memory block includes:
and packing the object memory block into a byte stream memory block according to the memory byte order of the target operating system.
3. The storage method of claim 1, wherein the first programming language comprises a C language, a C++ language, an assembly language.
4. The method of claim 1, wherein the splicing of the content data of the model into the object memory block of the first programming language is performed in a second programming language.
5. The method of claim 1, wherein the object memory blocks of the first programming language include at least one sub-object memory block and a pointer corresponding to each of the sub-object memory blocks, the pointers corresponding to the sub-object memory blocks pointing to memory addresses of the sub-object memory blocks.
6. The method of claim 5, wherein the object memory blocks of the first programming language include at least two sub-object memory blocks and pointers corresponding to each of the sub-object memory blocks, and wherein an order of the sub-object memory blocks is consistent with an order of the pointers corresponding to the sub-object memory blocks.
7. A neural network model loading method, comprising:
obtaining a target model file obtained by the storage method according to any one of claims 1-6;
loading the target model file into a memory in a memory mapping mode to obtain a byte stream memory block;
and converting the byte stream memory block into an object memory block in a type conversion mode.
8. A neural network model storage device, comprising:
the first acquisition module is used for acquiring a source model file of the neural network model;
the parsing module is used for parsing the data in the source model file to obtain the content data of the model;
the splicing module is used for splicing the content data of the model into an object memory block of a first programming language, and comprises the following steps:
sequentially filling the content data of the model into a memory structure, and splicing the content data into an object memory block of the first programming language;
the memory structure is obtained by carrying out layout according to the layout rule of the object of the first programming language in the memory;
the packing module is used for packing the object memory block into a byte stream memory block;
and the storage module is used for storing the byte stream memory blocks to obtain a target model file.
9. The storage device of claim 8, wherein the device further comprises: the second acquisition module is used for acquiring the memory byte order of the target operating system;
the packing module is further configured to: pack the object memory block into a byte stream memory block according to the memory byte order of the target operating system.
10. The storage device of claim 8, wherein the first programming language comprises a C language, a C++ language, an assembly language.
11. The storage device of claim 8, wherein the splicing of the content data of the model into the object memory block of the first programming language is performed in a second programming language.
12. The storage device of claim 8, wherein the object memory block of the first programming language includes at least one sub-object memory block and a pointer corresponding to each of the sub-object memory blocks, the pointer corresponding to a sub-object memory block pointing to the memory address of that sub-object memory block.
13. The storage device of claim 12, wherein the object memory blocks of the first programming language include at least two sub-object memory blocks and pointers corresponding to each of the sub-object memory blocks, the order of the sub-object memory blocks being consistent with the order of the pointers corresponding to the sub-object memory blocks.
14. A neural network model loading device, characterized by comprising:
a third obtaining module, configured to obtain the target model file stored by the storage module in the storage device according to any one of claims 8 to 13;
the loading module is used for loading the target model file into a memory in a memory mapping mode to obtain a byte stream memory block;
and the conversion module is used for converting the byte stream memory block into an object memory block in a type conversion mode.
15. An apparatus comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the storage method of any one of claims 1-6 or the loading method of claim 7.
16. A readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements a storage method as claimed in any one of claims 1-6 or a loading method as claimed in claim 7.
CN202010415219.3A 2020-05-15 2020-05-15 Neural network model storage method, loading method, device, equipment and storage medium Active CN111596972B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010415219.3A CN111596972B (en) 2020-05-15 2020-05-15 Neural network model storage method, loading method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010415219.3A CN111596972B (en) 2020-05-15 2020-05-15 Neural network model storage method, loading method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111596972A CN111596972A (en) 2020-08-28
CN111596972B true CN111596972B (en) 2023-09-26

Family

ID=72185709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010415219.3A Active CN111596972B (en) 2020-05-15 2020-05-15 Neural network model storage method, loading method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111596972B (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9182962B2 (en) * 2010-12-09 2015-11-10 Todd Bradley KNEISEL Method for translating a cobol source program into readable and maintainable program code in an object oriented second programming language

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108564169A (en) * 2017-04-11 2018-09-21 上海兆芯集成电路有限公司 Hardware processing element, neural network unit and computer usable medium
CN107463421A (en) * 2017-07-14 2017-12-12 清华大学 A kind of compiling implement method and system of static procedural model
CN110955380A (en) * 2018-09-21 2020-04-03 中科寒武纪科技股份有限公司 Access data generation method, storage medium, computer device and apparatus
CN109491784A (en) * 2018-10-18 2019-03-19 北京旷视科技有限公司 Reduce method, apparatus, the electronic equipment, readable storage medium storing program for executing of EMS memory occupation amount
CN109977077A (en) * 2019-03-25 2019-07-05 腾讯科技(深圳)有限公司 Model file storage method, device, readable storage medium storing program for executing and computer equipment
CN110908965A (en) * 2019-11-07 2020-03-24 北京浪潮数据技术有限公司 Object storage management method, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Artem M. Grachev. Compression of recurrent neural networks for efficient language modeling. Applied Soft Computing, 2019, vol. 79, full text. *
何炎祥; 吴伟; 陈勇; 李清安; 刘健博. A safe typed memory model for C-like language environments (一种用于类C语言环境的安全的类型化内存模型). 计算机研究与发展 (Journal of Computer Research and Development), 2012, (11), full text. *

Also Published As

Publication number Publication date
CN111596972A (en) 2020-08-28

Similar Documents

Publication Publication Date Title
CN111240684B (en) Cutting method and device of JS codes, medium and electronic equipment
CN113126990B (en) Page development method, device, equipment and storage medium
CN105354013A (en) Application interface rendering method and apparatus
EP3037958B1 (en) Declarative user interface representation conversion via hierarchical templates
US9823908B2 (en) Apparatus for providing framework to develop client application executed on multiple platforms, and method using the same
CN111381817A (en) Method, device, medium and electronic equipment for realizing cross-platform multi-language development
CN110209967B (en) Page loading method and device, terminal equipment and computer readable medium
CN110457144A (en) A kind of method, apparatus that realizing front end applications, medium and equipment
CN114035805B (en) Transcoding method, device, medium and equipment for precompiled device
CN104850388A (en) Method and apparatus for drafting webpage
CN110489323A (en) Visual RPC API adjustment method, device, medium and equipment
CN112905179A (en) Mobile terminal H5 page generation method and device, electronic equipment and storage medium
CN113779168A (en) Vector space data analysis method based on WebAssembly
CN115982491A (en) Page updating method and device, electronic equipment and computer readable storage medium
CN113761871A (en) Rich text rendering method and device, electronic equipment and storage medium
CN113836469A (en) Website front-end development method and equipment
CN113094138B (en) Interface display method and device, electronic equipment and storage medium
CN111596972B (en) Neural network model storage method, loading method, device, equipment and storage medium
CN115268904A (en) User interface design file generation method, device, equipment and medium
CN112416533A (en) Method and device for running application program on browser and electronic equipment
CN117056507A (en) Long text analysis method, long text analysis model training method and related equipment
CN116567290A (en) Online audio and video processing method based on WebAsssembly
US20130290377A1 (en) Populating data structures of software applications with input data provided according to extensible markup language (xml)
CN115098092A (en) Page generation method, device, equipment and storage medium
CN111539200B (en) Method, device, medium and electronic equipment for generating rich text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant