CN114579207A - Model file layered loading calculation method of convolutional neural network - Google Patents
- Publication number
- CN114579207A (application CN202210304705.7A)
- Authority
- CN
- China
- Prior art keywords
- model file
- loading
- model
- memory
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44521—Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention relates to a model file layered loading calculation method for a convolutional neural network. S1: load the model file into an external storage device of the embedded device, such as a hard disk or SD card, and denote the size of the model file as m_w. S2: denote the memory size of the embedded device as m_a and the maximum run-time memory of the detection program as m_b; the memory size m_c that the embedded device can allocate for storing the model file is then m_c = (m_a - m_b) × α. S3: if m_w <= m_c, load the model file into the memory of the embedded device in a single pass and go to S5. S4: if m_w > m_c, the model file must be loaded in multiple passes. S5: perform the forward calculation of the algorithm model. The method applies the idea of step-by-step loading, analyzing model files of different sizes against the memory space of the specific embedded device, so that step-by-step loading of the model file is achieved with the fewest memory accesses; it thus balances the real-time requirement while breaking through the limited storage space of embedded devices.
Description
Technical Field
The invention relates to the field of computer vision, in particular to a model file layered loading calculation method of a convolutional neural network.
Background
With the continuous development of deep learning in target recognition and detection, networks such as VGG, GoogLeNet and ResNet have evolved from AlexNet toward ever deeper architectures in pursuit of better detection accuracy. Researchers extract deep-level features of a detection target by increasing the number of convolutional layers, the number of convolution kernels, and so on. Although deep network models excel on many problems, they are constrained in time and space in practical applications: a large, deep network model entails an enormous amount of computation, and even with the help of a graphics processor it is difficult to embed and develop on devices with limited computing and storage resources, or to meet the timing demands of many everyday scenarios. High-performance computers, meanwhile, carry high production and maintenance costs and are unsuitable for large-scale deployment. Consequently, many current applications, especially mobile-terminal and embedded deployments such as automatic driving, fatigue detection and robotics, are limited by their integration hardware and processing speed and cannot load and compute neural network model files ranging from tens to hundreds of megabytes. Model compression research has therefore emerged, and lightweight deep neural networks are continually proposed; however, compressing a model in a single step costs detection accuracy, so naive compression alone is clearly not advisable.
For a deep neural network, the great majority of the model's parameters are concentrated in the convolutional layers. Avoiding the conflict between the huge number of model parameters of a deep neural network and the limited computing and storage capacity of embedded devices, so that the deep network model can break through its application limits to some extent, is the foremost problem in developing and applying deep neural networks on embedded terminals.
Disclosure of Invention
Aiming at the problems in the prior art, the technical problem to be solved by the invention is that a deep neural network is difficult to embed and develop on devices with limited computing and storage resources.
In order to solve the above technical problems, the invention adopts the following technical scheme. A model file layered loading calculation method of a convolutional neural network comprises the following steps:

S1, loading the model file into an external storage device of the embedded device, such as a hard disk or SD card, and denoting the size of the model file as m_w;

S2, denoting the memory size of the embedded device as m_a and the maximum run-time memory of the detection program as m_b; the memory size m_c that the embedded device can allocate for storing the model file is then m_c = (m_a - m_b) × α, where α is a margin factor, set here to 0.9;

S3, if m_w <= m_c, loading the model file into the memory of the embedded device in a single pass, and going to S5;

S4, if m_w > m_c, loading the model file in multiple passes;

S5, performing the forward calculation of the convolutional neural network.
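The memory-budget decision in S1-S4 can be sketched as follows. This is a minimal illustration, not the patented implementation; the function names and the use of megabytes as units are assumptions:

```python
def allocatable_memory(m_a: float, m_b: float, alpha: float = 0.9) -> float:
    """Memory available for model weights: m_c = (m_a - m_b) * alpha."""
    return (m_a - m_b) * alpha

def loading_plan(m_w: float, m_c: float) -> str:
    # S3: one-shot load if the whole file fits; S4: otherwise load in passes.
    return "one-shot" if m_w <= m_c else "multi-pass"

# Example: 64 MB device memory, 8 MB program working set, 56 MB model file.
m_c = allocatable_memory(64, 8, alpha=0.9)   # about 50.4 MB
print(loading_plan(56, m_c))                  # multi-pass
```

The margin factor α leaves headroom so the weight buffer never consumes the entire remaining memory.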
As an improvement, the step of loading in multiple passes in S4 is specifically as follows:

To ensure that each load brings in all parameters of N layers, denote the size of the model file already loaded as m_o, with initial value 0; let L mark the L-th convolutional layer of the model, with initial value 1.

Let the size of the model file actually loaded at the n-th pass (n has initial value 1) be m_t; its initial value is the memory occupied by the parameters of the L-th convolutional layer. Denote by m_L the memory occupied by the parameters of the L-th convolutional layer. The n-th actual load of the model file proceeds as follows:

(1) if m_t + m_(L+1) <= m_c and L < N_o, set L = L + 1, update m_t = m_t + m_L, and repeat step (1), where N_o is the number of model convolutional layers;

(2) load the parameters of the layers grouped in step (1), of total size m_t, into memory;

(3) update m_o = m_o + m_t;

(4) if m_o < m_w, update L = L + 1 and go to step (1), i.e., begin the (n+1)-th actual load of the model file;

(5) if m_o = m_w, the loading of the whole model file is complete.
As an improvement, the forward calculation of the convolutional neural network in S5 proceeds as follows: if the preceding step was S4, the forward calculation of the whole convolutional neural network starts layer by layer over the model parameters already loaded in memory, and once the parameters in memory have taken part in the calculation, control returns to S4 for the next load; if the preceding step was S3, the forward calculation runs directly through to the end.
Compared with the prior art, the invention has at least the following advantages:
1. The invention uses the idea of step-by-step loading, analyzing model files of different sizes against the memory space of the specific embedded device, so that step-by-step loading of the model file is achieved with the fewest memory accesses; it balances the real-time requirement while breaking through the limited storage space of embedded devices.
2. The method applies the idea of layered calculation to the forward pass of the deep convolutional neural network, differing from the existing end-to-end neural network operation process, which has good real-time performance.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention.
Detailed Description
The present invention will be described in further detail below.
The invention provides a model file layered loading calculation method for a convolutional neural network. According to the differing storage capacities of different devices and the differing algorithm models, the model file is dynamically loaded into the memory of the embedded device, unlike the traditional single-pass loading method. A large, deep convolutional neural network can thus be embedded and developed on a mobile terminal with limited computing and storage capacity, which solves the technical problem that deep neural networks are difficult to embed and develop on devices with limited computing and storage resources, and to some extent eases the difficulty of bringing deep convolutional neural networks into practical application scenarios; the method therefore has good application prospects. The core idea is to load and compute dynamically in a layered fashion (i.e., treating each layer as a whole).
The invention dynamically adjusts the process of loading the model file into the memory of the embedded device according to the specific algorithm model, and carries out the model's operation using a layered calculation method. The trained model file is loaded into the memory of the embedded device by dynamic loading. The model file stores the convolution kernel parameters of every convolutional layer; a convolution kernel is generally an s × s matrix, so the parameter count of a single convolutional layer is the product of s × s and the number of convolution kernels in that layer. The model file stores the parameters layer by layer, one by one, in a one-dimensional array format. The dynamic loading method proceeds through the following specific steps.
the model file is a text file used for storing convolution kernel parameters of each convolution layer in the convolutional neural network, wherein the convolution kernel parameters are stored in a floating point number format.
Training a convolutional neural network is a process of continually updating the convolution kernel parameters; the ultimate aim is to optimize the network's performance, reflected intuitively in the convergence of the loss value. Saving the kernel parameters at that point yields the trained model file. The convolutional neural network is trained by existing methods.
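The per-layer sizes m_L used below follow from the parameter count described above. A minimal sketch, using the document's convention (parameters of one layer = s × s times its kernel count) and assuming 32-bit floats, i.e. 4 bytes per parameter; the helper names are illustrative:

```python
def layer_params(s: int, num_kernels: int) -> int:
    # Per the description: parameters of one conv layer = s*s * kernel count.
    # (For a layer with c_in input and c_out output channels, the kernel
    #  count works out to c_in * c_out.)
    return s * s * num_kernels

def layer_memory_bytes(s: int, num_kernels: int, bytes_per_param: int = 4) -> int:
    # Assuming the parameters are stored as 32-bit floats (4 bytes each).
    return layer_params(s, num_kernels) * bytes_per_param

# First conv layer of VGG16: 3x3 kernels, 3 input channels, 64 output channels.
print(layer_params(3, 3 * 64))          # 1728
print(layer_memory_bytes(3, 3 * 64))    # 6912
```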
Referring to fig. 1, a model file hierarchical loading calculation method of a convolutional neural network includes the following steps:
s1, loading the model file into an external storage device such as a hard disk or an SD card of the embedded device, and assuming that the size of the model file is mw。
S2, recording the memory size of the embedded device as maRecording the maximum operation memory of the detection program as mbThen the embedded device can allocate the memory size m for storing the model filecIs mc=(ma-mb) And x α, α is a margin factor and is set to 0.9.
S3, if mw<=mcThen, the model file is directly loaded into the memory of the embedded device at one time, and the process goes to S5.
S4, if mw>mcThen the model file needs to be loaded in multiple times.
And S5, performing a forward calculation process of the convolutional neural network.
Specifically, the step of loading the model file in multiple passes in S4 is as follows:

According to the forward-propagation principle of the convolutional neural network, the input image data is convolved layer by layer through the convolutional layers of the backbone until it reaches the detection head, which outputs the result after processing. It must therefore be guaranteed that each load brings in the complete parameters of N layers, where N lies between 1 and the number of model convolutional layers N_o.

Because the forward calculation of the convolutional neural network proceeds layer by layer, to ensure that each load is all parameters of N layers (N being any integer between 1 and the maximum number of convolutional layers), denote the size of the model file already loaded as m_o, with initial value 0; let L mark the L-th convolutional layer of the model, with initial value 1.

Let the size of the model file actually loaded at the n-th pass (n has initial value 1) be m_t; its initial value is the memory occupied by the parameters of the L-th convolutional layer. Denote by m_L the memory occupied by the parameters of the L-th convolutional layer. The n-th actual load of the model file proceeds as follows:

(1) if m_t + m_(L+1) <= m_c and L < N_o, set L = L + 1, update m_t = m_t + m_L, and repeat step (1), where N_o is the number of model convolutional layers;

(2) load the parameters of the layers grouped in step (1), of total size m_t, into memory;

(3) update m_o = m_o + m_t;

(4) if m_o < m_w, update L = L + 1 and go to step (1), i.e., begin the (n+1)-th actual load of the model file;

(5) if m_o = m_w, the loading of the whole model file is complete.
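The per-pass grouping described above amounts to a greedy planner that packs consecutive layers into loads of at most m_c bytes each. A hedged sketch, assuming the per-layer sizes and the budget m_c are known up front; the function name and the 0-based layer indices are conventions of this illustration:

```python
def plan_loads(layer_bytes, m_c):
    """Greedily group consecutive conv layers so each load fits within m_c.

    Returns a list of (start_layer, end_layer, group_size) tuples: accumulate
    layers while the running total m_t stays within m_c, then emit one load
    and continue until the whole file (m_w) has been covered.
    """
    if any(b > m_c for b in layer_bytes):
        raise ValueError("a single layer exceeds the available memory m_c")
    plan, start, m_t = [], 0, 0
    for L, m_L in enumerate(layer_bytes):
        if m_t + m_L > m_c:          # next layer would overflow: emit a load
            plan.append((start, L - 1, m_t))
            start, m_t = L, 0
        m_t += m_L
    plan.append((start, len(layer_bytes) - 1, m_t))  # final load, m_o == m_w
    return plan

# Example: five layers of 3, 4, 5, 2 and 6 MB with a 9 MB budget -> 3 loads.
print(plan_loads([3, 4, 5, 2, 6], 9))  # [(0, 1, 7), (2, 3, 7), (4, 4, 6)]
```

Note the guard at the top: the method presupposes that every single layer fits into m_c, since a layer is the smallest unit of loading.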
Specifically, the forward calculation of the convolutional neural network in S5 proceeds as follows: if the preceding step was S4, the forward calculation of the whole convolutional neural network starts layer by layer over the model parameters already loaded in memory, and once the parameters in memory have taken part in the calculation, control returns to S4 for the next load; if the preceding step was S3, the forward calculation runs directly through to the end.
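The alternation between loading a group of layers and computing over it can be sketched as follows. Here `load_group` and `run_layers` are placeholder callables (assumptions of this sketch) standing in for device-specific file I/O and layer computation:

```python
def forward_layered(input_x, plan, load_group, run_layers):
    """Alternate loading (S4) and forward computation (S5) over the plan.

    plan:        list of (start_layer, end_layer, group_size) tuples
    load_group:  (start, end) -> weights for layers start..end (device I/O)
    run_layers:  (x, weights) -> activations after applying those layers
    """
    x = input_x
    for start, end, _size in plan:
        weights = load_group(start, end)   # S4: bring the next chunk into memory
        x = run_layers(x, weights)         # S5: forward pass over that chunk
        # weights go out of scope here, freeing the buffer for the next load
    return x
```

Only one group of layer weights is resident at a time, which is what keeps the peak memory at the size of the largest group rather than the whole model file.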
The model file layered loading calculation method of the invention can apply a deep convolutional neural network on embedded devices with extremely limited memory resources when real-time performance is not the priority, breaking the traditional dependence of neural networks on high-performance computers. Where the real-time requirement is high, the algorithm model can first be made lightweight and then combined with the proposed method to realize the application on the embedded terminal.
Memory: the memory requirement, used to measure the memory occupied by each convolutional layer of the convolutional neural network.

Parameters: the number of parameters, used to measure the scale of a convolutional neural network.
Below, the convolutional neural network VGG16 is taken as an example: its original memory requirement and parameter count are analyzed and compared with those obtained using the method of the invention. Since the last three layers of VGG16 are fully connected, only the first 13 convolutional layers are listed here. Table 1 shows the parameter count and memory requirement of each convolutional layer of VGG16.
TABLE 1
As can be seen from table 1, the conventional convolutional neural network loading method, in which the entire model file is loaded into memory at once, occupies 56 MB of memory, making migration and development very difficult on most embedded devices with small memory. With the dynamic loading and layered calculation method of the invention, the minimum memory requirement is only 12 MB, a 78% reduction, which is a very significant effect.
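For reference, the 56 MB figure can be reproduced from the standard VGG16 layout, assuming 3×3 kernels, 32-bit float weights, and omitting biases; the channel progression below is the well-known VGG16 configuration, not taken from Table 1:

```python
# Channel progression (in, out) of VGG16's 13 conv layers, all 3x3 kernels.
VGG16_CONV = [(3, 64), (64, 64), (64, 128), (128, 128),
              (128, 256), (256, 256), (256, 256),
              (256, 512), (512, 512), (512, 512),
              (512, 512), (512, 512), (512, 512)]

def vgg16_conv_params():
    # Weight parameters only (biases omitted): 3*3 * c_in * c_out per layer.
    return sum(3 * 3 * c_in * c_out for c_in, c_out in VGG16_CONV)

total = vgg16_conv_params()
print(total)                 # 14710464 parameters
print(total * 4 / 2**20)     # about 56.1 MB at 4 bytes per parameter
```

This matches the roughly 56 MB single-pass requirement quoted above; the 12 MB minimum under layered loading depends on the table's per-layer breakdown, which is not reproduced here.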
Finally, the above embodiments merely illustrate the technical solution of the invention without limiting it. Although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that modifications or equivalent substitutions may be made without departing from the spirit and scope of the technical solution, all of which shall be covered by the claims of the invention.
Claims (3)
1. A model file layered loading calculation method of a convolutional neural network, characterized by comprising the following steps:

S1, loading the model file into an external storage device of the embedded device, such as a hard disk or SD card, and denoting the size of the model file as m_w;

S2, denoting the memory size of the embedded device as m_a and the maximum run-time memory of the detection program as m_b; the memory size m_c that the embedded device can allocate for storing the model file is then m_c = (m_a - m_b) × α, where α is a margin factor, typically between 0.7 and 0.9, set according to the computational overhead of the device's resources;

S3, if m_w <= m_c, loading the model file into the memory of the embedded device in a single pass, and going to S5;

S4, if m_w > m_c, loading the model file in multiple passes;

S5, performing the forward calculation of the convolutional neural network.
2. The model file layered loading calculation method of a convolutional neural network as claimed in claim 1, characterized in that the step of loading in multiple passes in S4 is specifically as follows:

To ensure that each load brings in all parameters of N layers, denote the size of the model file already loaded as m_o, with initial value 0; let L mark the L-th convolutional layer of the model, with initial value 1.

Let the size of the model file actually loaded at the n-th pass (n has initial value 1) be m_t; its initial value is the memory occupied by the parameters of the L-th convolutional layer. Denote by m_L the memory occupied by the parameters of the L-th convolutional layer. The n-th actual load of the model file proceeds as follows:

(1) if m_t + m_(L+1) <= m_c and L < N_o, set L = L + 1, update m_t = m_t + m_L, and repeat step (1), where N_o is the number of model convolutional layers;

(2) load the parameters of the layers grouped in step (1), of total size m_t, into memory;

(3) update m_o = m_o + m_t;

(4) if m_o < m_w, update L = L + 1 and go to step (1), i.e., begin the (n+1)-th actual load of the model file;

(5) if m_o = m_w, the loading of the whole model file is complete.
3. The model file layered loading calculation method of a convolutional neural network as claimed in claim 1 or 2, characterized in that the forward calculation of the convolutional neural network in S5 specifically comprises: if the preceding step was S4, starting the forward calculation of the whole convolutional neural network layer by layer over the model parameters already loaded in memory, and after the parameters in memory have taken part in the calculation, returning to S4 for the next load; if the preceding step was S3, performing the forward calculation directly through to the end.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210304705.7A CN114579207B (en) | 2022-03-22 | 2022-03-22 | Model file hierarchical loading calculation method of convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114579207A true CN114579207A (en) | 2022-06-03 |
CN114579207B CN114579207B (en) | 2023-09-01 |
Family
ID=81776296
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210304705.7A Active CN114579207B (en) | 2022-03-22 | 2022-03-22 | Model file hierarchical loading calculation method of convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114579207B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108304802A (en) * | 2018-01-30 | 2018-07-20 | 华中科技大学 | A kind of Quick filter system towards extensive video analysis |
US20200050555A1 (en) * | 2018-08-10 | 2020-02-13 | Lg Electronics Inc. | Optimizing data partitioning and replacement strategy for convolutional neural networks |
CN111242180A (en) * | 2020-01-03 | 2020-06-05 | 南京邮电大学 | Image identification method and system based on lightweight convolutional neural network |
Non-Patent Citations (1)
Title |
---|
CAO VU DUNG et al., "Autonomous concrete crack detection using deep fully convolutional neural network", Automation in Construction, vol. 99, pp. 52-58, XP085574683, DOI: 10.1016/j.autcon.2018.11.028 *
Also Published As
Publication number | Publication date |
---|---|
CN114579207B (en) | 2023-09-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11907760B2 (en) | Systems and methods of memory allocation for neural networks | |
CN110546628B (en) | Minimizing memory reads with directed line buffers to improve neural network environmental performance | |
KR102434729B1 (en) | Processing method and apparatus | |
TW202026858A (en) | Exploiting activation sparsity in deep neural networks | |
CN108573305B (en) | Data processing method, equipment and device | |
US11468316B2 (en) | Cluster compression for compressing weights in neural networks | |
WO2022042123A1 (en) | Image recognition model generation method and apparatus, computer device and storage medium | |
CN111242180B (en) | Image identification method and system based on lightweight convolutional neural network | |
CN112163601B (en) | Image classification method, system, computer device and storage medium | |
US10699190B1 (en) | Systems and methods for efficiently updating neural networks | |
US11487342B2 (en) | Reducing power consumption in a neural network environment using data management | |
CN113705775A (en) | Neural network pruning method, device, equipment and storage medium | |
CN109145107B (en) | Theme extraction method, device, medium and equipment based on convolutional neural network | |
WO2023236319A1 (en) | Convolutional neural network deployment and optimization method for microcontroller | |
US20210326702A1 (en) | Processing device for executing convolutional neural network computation and operation method thereof | |
CN112200310B (en) | Intelligent processor, data processing method and storage medium | |
CN114579207A (en) | Model file layered loading calculation method of convolutional neural network | |
JP2024516514A (en) | Memory mapping of activations for implementing convolutional neural networks | |
CN112784818A (en) | Identification method based on grouping type active learning on optical remote sensing image | |
CN114781634B (en) | Automatic mapping method and device of neural network array based on memristor | |
CN113642724B (en) | CNN accelerator for high bandwidth storage | |
CN113743448B (en) | Model training data acquisition method, model training method and device | |
US20220318634A1 (en) | Method and apparatus for retraining compressed model using variance equalization | |
JP2023024960A (en) | Optimization of memory usage for efficiently executing neural network | |
TW202215300A (en) | Convolutional neural network operation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||