CN113673684A - Edge end DNN model loading system and method based on input pruning - Google Patents
- Publication number
- CN113673684A
- Authority
- CN
- China
- Prior art keywords
- model
- input
- compression
- data
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/1734—Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
Abstract
The invention discloses an edge-end DNN model loading system and method based on input pruning. The system comprises a management module and a compression module: the management module comprises a sample management module, a model training module and a model management module; the compression module comprises a compression mode selection module, a data compression module, a model compression and retraining module and a compression log recording module. With this system, a Bayesian network is constructed and trained on sample data, the association between the model's inputs and outputs is analyzed through the Bayesian network, and a bidirectional list representing each input's degree of influence is constructed according to Pareto optimality, avoiding the high time complexity of deleting input attributes one by one. Finally, a clustering algorithm is used to analyze the similarity between input data attributes, and personalized, intelligent compression and loading of the model is completed according to these two compression strategies and the strategies derived from them, improving the overall performance of the edge end and the number of models that can be deployed.
Description
Technical Field
The invention relates to the technical field of intelligent network edge terminals, and in particular to an edge-end DNN model loading system and method based on input pruning.
Background
With the development of edge intelligence technology, the large amount of data generated at the device edge no longer needs to be sent to a cloud server for centralized processing; it can be processed efficiently and with low latency through cloud-edge or edge-edge collaboration, which makes it possible to deploy and run deep neural network models at the edge. However, deep neural network models have an extremely large number of parameters, with some models reaching the order of millions. Training and testing deep neural network models that contain so many parameters has several shortcomings:
(1) The constructed model's parameters must be trained to achieve the expected effect, which consumes a large amount of computing resources and requires equipment with strong computing power to obtain acceptable training and testing speeds.
(2) Persisting the large number of parameters of a deep neural network also occupies high-capacity memory or disk resources.
Because of these shortcomings, neural networks are currently trained and tested on high-performance servers or clusters; the limited storage and computing resources of edge smart devices restrict the deployment of neural networks on them. Compressing DNN models efficiently while maintaining computational accuracy, so that they are suitable for running on edge devices, has therefore become a hot research issue in the field of edge intelligence.
At present, common model compression methods such as network pruning mainly produce small, fast neural networks by removing values from redundant weight tensors. However, conventional model compression requires the developer to manually set the number of parameters to delete, retrain the model, adjust the deletion settings according to the model's performance, and repeat these steps until the expected compression effect is achieved; the compression result then still requires extensive experimental testing, which imposes a huge workload on developers. Because compression cannot be automated by setting thresholds in advance, this workflow is unfriendly to many applications. In addition, none of the conventional model compression methods reduces the size of the input data, which continues to place significant memory and bandwidth pressure on the model. Model compression at the edge must consider not only model size and capacity but also reduce the amount of input data as much as possible, compressing the deep neural network model by processing the data generated at the edge in a personalized manner.
Therefore, in order for model compression technology to be better applied to deep neural network models so that they can be deployed and run on resource-limited edge devices, the model compression method needs further improvement. It should not only reduce the size of the compressed model, the amount of edge-side data, and the number of sensors used, improving the overall performance of the edge end and the number of deployable models, but also further reduce the cost of data transmission, saving network bandwidth and storage resources. In addition, no mature model compression system serving the edge currently meets the requirements of personalization and intelligence.
Disclosure of Invention
In view of the above-mentioned deficiencies of the prior art, the present invention provides an edge-end DNN model loading system and method based on input pruning.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows: an edge-end DNN model loading system based on input pruning comprises a management module and a compression module;
the management module comprises a sample management module, a model training module and a model management module;
the sample management module is used for collecting data to construct a sample and using the sample as input data of a training model;
the sample is data collected by the edge device, and the data is processed and integrated, and the method specifically comprises the following steps: and performing missing value completion and normalization operation on the collected data, storing the data into a file, and then further storing the description information of the data and the file storage path into a database.
The model training module trains the model according to the constructed sample;
Further, the model training module records and stores the basic information of the model, including evaluation indexes, model size, model name, model storage path, and the numbers of input and output attributes, providing a reference basis for the user's compression service.
The model management module is used for providing model management for a user, and the user can conveniently check the trained evaluation indexes;
furthermore, the model management module also provides the functions of downloading and uploading the model.
The compression module comprises a compression mode selection module, a data compression module, a model compression and retraining module and a compression log recording module;
the compression mode selection module selects different compression modes according to the relevance among different data;
the compression mode in the compression mode selection module comprises: the compression mode comprises a compression mode based on an input and output association pruning strategy, a compression mode based on an input and output similarity pruning strategy, a compression mode based on an input and output association and then based on an input and output similarity pruning strategy and a compression mode based on an input and output association pruning strategy.
The data compression module generates the correspondingly compressed data and stores it under a set path; at the same time, the data information is persisted, and the compressed data serves as the input data of the model compression and retraining module;
the model compression and retraining module executes model compression operation according to the compressed data and a compression model threshold value set by a user, constructs a compressed model and executes retraining operation on the compressed model;
the compressed model threshold includes a model size threshold and a model accuracy threshold.
After the model compression method has been executed, the compression log recording module stores the basic information of the model, together with the model's update time, accuracy and recall, in the compression log for the user to compare and review.
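A compression log entry of this kind might look like the sketch below. The field names and the in-memory list are assumptions for illustration; the patent does not define a log schema, and a real system would persist the records to the database.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class CompressionLogEntry:
    """One record written after a compression run: basic model information
    plus update time, accuracy and recall (hypothetical field names)."""
    model_name: str
    model_size_bytes: int
    model_path: str
    accuracy: float
    recall: float
    updated_at: str = field(default_factory=lambda: datetime.now().isoformat())

def append_log(log: list, entry: CompressionLogEntry) -> list:
    # Stored in memory here; a database write would replace this in practice.
    log.append(entry)
    return log
```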
On the other hand, the invention also provides a method for loading a model by adopting the edge-end DNN model loading system based on input pruning, which comprises the following steps:
step 1: performing missing value completion and normalization processing on the data collected by the edge end, storing the data into a file, and further storing the data stored into the file into a sample information table of a database;
step 2: acquiring data in a sample information table, setting relevant parameters of a training model by a user, training the specified model, and storing basic information of the trained model, including evaluation indexes, model size, model name, model storage path, model input attribute and output attribute number information, in a model information table of a database;
and step 3: according to data stored in the sample information table, a Bayesian network is constructed to analyze the relevance between the input and the output of the model, and the input influence degree of each input attribute on the output result is calculated, so that a compression mode based on the relevance pruning strategy between the input and the output is constructed;
the step 3 comprises the following substeps:
step 3.1: designing a Bayesian network structure according to initial model input data in a sample information table, training the Bayesian network, and setting a model size and a model accuracy threshold;
step 3.2: calculating the influence degree of each input attribute of the model on the output according to the trained Bayesian network;
step 3.3: a bi-directional list representing the degree of influence of the input is constructed using a multi-objective optimization algorithm pareto optimality.
Step 4: according to the input influence degrees obtained in step 3, analyze the correlation of the input attributes with a clustering algorithm to construct the compression mode based on the inter-input relation pruning strategy;
the step 4 comprises the following substeps:
step 4.1: performing clustering analysis by using a k-means clustering algorithm according to the input influence degree of each input attribute obtained in the step 3, and clustering and sorting the attributes with smaller input influence degree difference and higher similarity;
step 4.2: and on the basis of the step 4.1, compressing the input data according to the set pruning number and proportion.
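Step 4.1 can be sketched with a minimal one-dimensional k-means over the influence scores: attributes whose influence degrees land in the same cluster are treated as similar, so only a proportion of each cluster needs to be kept in step 4.2. The patent names k-means but not the distance measure or initialization, so both are assumptions here.

```python
import random

def kmeans_1d(values, k, iters=50, seed=0):
    """Cluster 1-D influence scores into k groups; returns the cluster
    index assigned to each value."""
    rng = random.Random(seed)
    centers = rng.sample(list(values), k)  # initialize from the data
    assign = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest center.
        assign = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Recompute each center as the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(values, assign) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assign
```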
And 5: arranging and combining the compression modes constructed in the step 3 and the step 4 to obtain four different compression modes, selecting different compression modes to compress the input data according to the relevance among different data, and compressing the model trained in the step 2 according to the compressed input data by using the Kolmogorov theorem;
step 6: retraining the model compressed in the step 5;
and 7: judging whether the model retraining result obtained in the step 6 reaches the threshold of the model size and the model accuracy set by the user, if so, determining that the compression is successful, storing the compressed model and data, and recording a compression log; otherwise, the compression is determined to be failed, and the failure reason and the related parameters are returned to provide parameter basis for the user to re-compress;
and 8: and loading the successfully compressed model to the edge terminal.
The beneficial effects produced by the above technical scheme are as follows:
1. To enable deep neural network models to be deployed and run efficiently on edge devices, the invention provides a personalized, intelligent model loading system that meets user requirements while reducing model compression processing time. The system is mainly applied at the edge, allowing a user to select different model compression modes according to the specific scenario and the characteristics of the model's training data. This reduces the workload of model compression developers while serving the user, resulting in a model loading tool that is convenient to operate, efficient, and clearly effective at compression, and improving the overall performance of the edge end and the number of deployable models.
2. The method analyzes the association between the inputs and outputs of the model through a Bayesian network and constructs a bidirectional list representing each input's degree of influence according to Pareto optimality, avoiding the high time complexity of deleting input attributes one by one.
3. The method analyzes the association between the inputs and outputs of the model through a Bayesian network to construct a compression mode based on the input-output association pruning strategy; it analyzes the similarity between input data attributes with a clustering algorithm to construct a compression mode based on the inter-input relation pruning strategy; and it completes personalized, intelligent compression of the model according to these two compression strategies and the strategies derived from them.
Drawings
Fig. 1 is a schematic structural diagram of an input pruning-based edge-end DNN model loading system provided in an embodiment of the present invention;
fig. 2 is a flowchart of a method for loading a model by using an input pruning based edge DNN model loading system in the embodiment of the present invention.
Detailed Description
The following describes embodiments of the invention in detail with reference to the accompanying drawings and examples. The examples are intended to illustrate the invention, not to limit its scope.
As shown in fig. 1, the edge-end DNN model loading system based on input pruning in this embodiment is as follows:
the system comprises a management module and a compression module;
the management module comprises a sample management module, a model training module and a model management module;
the sample management module is used for collecting data to construct a sample and using the sample as input data of a training model;
the sample is data collected by the edge device, and the data is processed and integrated, and the method specifically comprises the following steps: and performing missing value completion and normalization operation on the collected data, storing the data into a file, and then further storing the description information of the data and the file storage path into a database.
The model training module trains the model according to the constructed sample;
further, the model training module records and stores basic information of the model, including evaluation indexes, model size, model name, model storage path, model input attribute and output attribute number information, and provides reference basis for user compression service.
The model management module is used for providing model management for a user, and the user can conveniently check the trained evaluation indexes;
furthermore, the model management module also provides the functions of downloading and uploading the model.
The compression module comprises a compression mode selection module, a data compression module, a model compression and retraining module and a compression log recording module;
the compression mode selection module selects different compression modes according to the relevance among different data;
the compression mode in the compression mode selection module comprises: the compression mode comprises a compression mode based on an input and output association pruning strategy, a compression mode based on an input and output similarity pruning strategy, a compression mode based on an input and output association and then based on an input and output similarity pruning strategy and a compression mode based on an input and output association pruning strategy.
The data compression module generates correspondingly compressed data and stores the data in a set path, meanwhile, data information is stored persistently, and the compressed data is used as input data of the model compression and retraining module;
the model compression and retraining module executes model compression operation according to the compressed data and a compression model threshold value set by a user, constructs a compressed model and executes retraining operation on the compressed model;
the compressed model threshold includes a model size threshold and a model accuracy threshold.
After the model compression method has been executed, the compression log recording module stores the basic information of the model, together with the model's update time, accuracy and recall, in the compression log for the user to compare and review.
On the other hand, the embodiment also provides a method for loading a model by using the above input pruning based edge DNN model loading system, and a flow of the method is shown in fig. 2, and includes the following steps:
step 1: performing missing value completion and normalization processing on the data collected by the edge end, storing the data into a file, and further storing the data stored into the file into a sample information table of a database;
step 2: acquiring data in a sample information table, setting relevant parameters of a training model by a user, training the specified model, and storing basic information of the trained model, including evaluation indexes, model size, model name, model storage path, model input attribute and output attribute number information, in a model information table of a database;
and step 3: according to data stored in the sample information table, a Bayesian network is constructed to analyze the relevance between the input and the output of the model, and the input influence degree of each input attribute on the output result is calculated, so that a compression mode based on the relevance pruning strategy between the input and the output is constructed;
the step 3 comprises the following substeps:
step 3.1: designing a Bayesian network structure according to initial model input data in a sample information table, training the Bayesian network, and setting a model size and a model accuracy threshold;
step 3.2: calculating the influence degree of each input attribute of the model on the output according to the trained Bayesian network;
step 3.3: a bi-directional list representing the degree of influence of the input is constructed using a multi-objective optimization algorithm pareto optimality.
In this embodiment, the input attributes are pruned according to the input influence degree list: input attributes weakly associated with the output result are deleted, which corresponds to deleting the corresponding node of the list.
Step 4: according to the input influence degrees obtained in step 3, analyze the correlation of the input attributes with a clustering algorithm to construct the compression mode based on the inter-input relation pruning strategy;
the step 4 comprises the following substeps:
step 4.1: performing clustering analysis by using a k-means clustering algorithm according to the input influence degree of each input attribute obtained in the step 3, and clustering and sorting the attributes with smaller input influence degree difference and higher similarity;
step 4.2: and on the basis of the step 4.1, compressing the input data according to the set pruning number and proportion.
And 5: arranging and combining the compression modes constructed in the step 3 and the step 4 to obtain four different compression modes, selecting different compression modes to compress the input data according to the relevance among different data, and compressing the model trained in the step 2 according to the compressed input data by using the Kolmogorov theorem;
step 6: retraining the model compressed in the step 5;
and 7: judging whether the model retraining result obtained in the step 6 reaches the threshold of the model size and the model accuracy set by the user, if so, determining that the compression is successful, storing the compressed model and data, and recording a compression log; otherwise, the compression is determined to be failed, and the failure reason and the related parameters are returned to provide parameter basis for the user to re-compress;
and 8: and loading the successfully compressed model to the edge terminal.
In this embodiment, the system requires the user to select the model and the corresponding data for performing the compression operation, set the compression mode to be used, fill in the necessary compression parameters, and automatically perform the compression loading operation of the model. The user can correspondingly adjust the compression strategy of the model according to the requirement of the user or the feedback of the compression loading operation.
Claims (9)
1. An edge-end DNN model loading system based on input pruning is characterized in that: the system comprises a management module and a compression module;
the management module comprises a sample management module, a model training module and a model management module;
the sample management module is used for collecting data to construct a sample and using the sample as input data of a training model;
the model training module trains the model according to the constructed sample;
the model management module is used for providing model management for a user, and the user can conveniently check the trained evaluation indexes;
the compression module comprises a compression mode selection module, a data compression module, a model compression and retraining module and a compression log recording module;
the compression mode selection module selects different compression modes according to the relevance among different data;
the data compression module generates correspondingly compressed data and stores the data in a set path, meanwhile, data information is stored persistently, and the compressed data is used as input data of the model compression and retraining module;
the model compression and retraining module executes model compression operation according to the compressed data and a compression model threshold value set by a user, constructs a compressed model and executes retraining operation on the compressed model;
after the model compression method has been executed, the compression log recording module stores the basic information of the model, together with the model's update time, accuracy and recall, in the compression log for the user to compare and review.
2. The input pruning-based edge-end DNN model loading system of claim 1, wherein: the sample is data collected by the edge device, and the data is processed and integrated, and the method specifically comprises the following steps: and performing missing value completion and normalization operation on the collected data, storing the data into a file, and then further storing the description information of the data and the file storage path into a database.
3. The input pruning-based edge-end DNN model loading system of claim 1, wherein: the model training module is also used for recording and storing basic information of the model, including evaluation indexes, model size, model name, model storage path, model input attribute and output attribute number information, and providing reference basis for user compression service.
4. The input pruning-based edge-end DNN model loading system of claim 1, wherein: the model management module also provides the functions of downloading and uploading the model.
5. The input pruning-based edge-end DNN model loading system of claim 1, wherein: the compression modes in the compression mode selection module comprise: a mode based on the input-output association pruning strategy; a mode based on the inter-input similarity pruning strategy; a mode that applies the input-output association strategy first and then the inter-input similarity strategy; and a mode that applies the inter-input similarity strategy first and then the input-output association strategy.
6. The input pruning-based edge-end DNN model loading system of claim 1, wherein: the compressed-model thresholds include a model size threshold and a model accuracy threshold.
7. The method for model loading using the input pruning-based edge-end DNN model loading system of any one of claims 1 to 6, characterized by comprising the following steps:
Step 1: perform missing-value completion and normalization on the data collected at the edge, store the data to a file, and record the filed data in the sample information table of the database;
Step 2: obtain the data in the sample information table, let the user set the relevant training parameters, train the specified model, and store the trained model's basic information, including its evaluation indexes, model size, model name, model storage path, and the numbers of input and output attributes, in the model information table of the database;
Step 3: according to the data stored in the sample information table, construct a Bayesian network to analyze the association between the model's inputs and outputs, and calculate each input attribute's degree of influence on the output result, thereby constructing the compression mode based on the input-output association pruning strategy;
Step 4: using the input influence degrees obtained in step 3, analyze the correlation between input attributes with a clustering algorithm, constructing the compression mode based on the inter-input similarity pruning strategy;
Step 5: permute and combine the compression modes constructed in steps 3 and 4 to obtain four different compression modes; select a compression mode according to the associations among the data and compress the input data; then, using the Kolmogorov theorem, compress the model trained in step 2 according to the compressed input data;
Step 6: retrain the model compressed in step 5;
Step 7: judge whether the retrained model from step 6 meets the model size and model accuracy thresholds set by the user; if so, the compression succeeds: store the compressed model and data, and record a compression log; otherwise, the compression fails: return the failure reason and the related parameters as a basis for the user to re-compress;
Step 8: load the successfully compressed model onto the edge device.
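The acceptance check in step 7 reduces to two threshold comparisons. A minimal sketch follows; it returns the failure reason so it can be reported back to the user, as the claim requires. The function and parameter names are illustrative, not from the patent.

```python
def compression_succeeded(model_size_mb, accuracy,
                          size_threshold_mb, accuracy_threshold):
    """Step-7 check: compression succeeds only if the retrained model is
    both small enough AND still accurate enough (thresholds set by the
    user).  Returns (ok, reason) so a failed run can report why."""
    if model_size_mb > size_threshold_mb:
        return False, ("model size %.2f MB exceeds threshold %.2f MB"
                       % (model_size_mb, size_threshold_mb))
    if accuracy < accuracy_threshold:
        return False, ("accuracy %.3f is below threshold %.3f"
                       % (accuracy, accuracy_threshold))
    return True, "ok"
```

On success the compressed model and data are stored and a log entry is written; on failure the returned reason and parameters guide the user's next compression attempt.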
8. The method for model loading using the input pruning-based edge-end DNN model loading system according to claim 7, wherein step 3 comprises the following sub-steps:
Step 3.1: design the Bayesian network structure from the initial model input data in the sample information table, train the Bayesian network, and set the model size and model accuracy thresholds;
Step 3.2: from the trained Bayesian network, calculate the degree of influence of each input attribute of the model on the output;
Step 3.3: construct a bidirectional list representing the input influence degrees, using Pareto optimality as the multi-objective optimization criterion.
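Sub-steps 3.2 and 3.3 can be sketched as follows. The patent derives the influence degrees from a trained Bayesian network; as a hedged stand-in, this sketch scores each input attribute by its discrete mutual information with the output, then orders the attributes by score (a single-objective simplification of the claim's Pareto-optimality ranking). All function names are illustrative.

```python
from collections import Counter
import math

def influence_degrees(columns, output):
    """Score each input attribute's influence on the output.

    Stand-in for the patent's Bayesian-network score: discrete mutual
    information I(X_i; Y) between each input column and the output.
    """
    n = len(output)
    p_y = Counter(output)
    scores = []
    for col in columns:
        p_x = Counter(col)
        p_xy = Counter(zip(col, output))
        mi = 0.0
        for (xv, yv), c in p_xy.items():
            # (c/n) * log( (c/n) / ((p_x/n)*(p_y/n)) ) = (c/n)*log(c*n/(p_x*p_y))
            mi += (c / n) * math.log(c * n / (p_x[xv] * p_y[yv]))
        scores.append(mi)
    return scores

def influence_order(scores):
    """Order attribute indices by ascending influence: the head of the
    list holds the least influential inputs, i.e. pruning candidates."""
    return sorted(range(len(scores)), key=lambda i: scores[i])
```

An attribute that perfectly predicts the output gets a high score, while one statistically independent of the output scores zero and surfaces at the head of the list as a pruning candidate.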
9. The method for model loading using the input pruning-based edge-end DNN model loading system according to claim 7, wherein step 4 comprises the following sub-steps:
Step 4.1: using the input influence degrees obtained in step 3, perform cluster analysis with the k-means clustering algorithm, grouping and ordering the attributes whose influence degrees differ little and whose similarity is high;
Step 4.2: on the basis of step 4.1, compress the input data according to the set pruning number and proportion.
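Claim 9's sub-steps (k-means clustering of the influence degrees, then pruning by a set number or proportion) can be sketched as below. The parameters `k` and `prune_ratio` stand for the user-set cluster count and pruning proportion, and the simple 1-D Lloyd-iteration k-means is an illustrative implementation, not the patent's.

```python
import numpy as np

def cluster_and_prune(influences, k=2, prune_ratio=0.25):
    """Step 4.1: group attributes whose influence degrees are close via
    1-D k-means.  Step 4.2: mark the given proportion of attributes for
    pruning, least influential first."""
    x = np.asarray(influences, dtype=float)
    # Initialize cluster centres on evenly spaced quantiles of the scores.
    centres = np.quantile(x, np.linspace(0.0, 1.0, k))
    for _ in range(100):  # Lloyd iterations; converges quickly in 1-D
        labels = np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)
        new = np.array([x[labels == j].mean() if np.any(labels == j)
                        else centres[j] for j in range(k)])
        if np.allclose(new, centres):
            break
        centres = new
    n_prune = int(len(x) * prune_ratio)
    pruned = sorted(np.argsort(x)[:n_prune].tolist())
    return labels, pruned
```

The returned `pruned` indices are the input attributes to drop before the model itself is compressed and retrained (steps 5 and 6).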
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110973801.6A CN113673684A (en) | 2021-08-24 | 2021-08-24 | Edge end DNN model loading system and method based on input pruning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113673684A true CN113673684A (en) | 2021-11-19 |
Family
ID=78545571
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113673684A (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102955857A (en) * | 2012-11-09 | 2013-03-06 | Beihang University | Class center compression transformation-based text clustering method in search engine |
CN111144561A (en) * | 2018-11-05 | 2020-05-12 | Hangzhou Hikvision Digital Technology Co., Ltd. | Neural network model determining method and device |
CN111709522A (en) * | 2020-05-21 | 2020-09-25 | Harbin Institute of Technology | Deep learning target detection system based on server-embedded cooperation |
CN112132005A (en) * | 2020-09-21 | 2020-12-25 | Fuzhou University | Face detection method based on cluster analysis and model compression |
CN112184391A (en) * | 2020-10-16 | 2021-01-05 | Institute of Computing Technology, Chinese Academy of Sciences | Recommendation model training method, medium, electronic device and recommendation model |
CN112200104A (en) * | 2020-10-15 | 2021-01-08 | Chongqing University of Science and Technology | Chemical engineering fault diagnosis method based on novel Bayesian framework for enhanced principal component analysis |
US20210081789A1 (en) * | 2019-09-13 | 2021-03-18 | Latent AI, Inc. | Optimizing execution of a neural network based on operational performance parameters |
CN112685139A (en) * | 2021-01-11 | 2021-04-20 | Northeastern University | K8S and Kubeedge-based cloud edge deep learning model management system and model training method |
CN112906294A (en) * | 2021-01-28 | 2021-06-04 | Samsung (China) Semiconductor Co., Ltd. | Quantization method and quantization device for deep learning model |
CN112948532A (en) * | 2021-04-08 | 2021-06-11 | Henan Gaotong Internet of Things Co., Ltd. | Linked-list data compression strategy selection method and system based on industrial big data analysis |
Non-Patent Citations (4)
Title |
---|
VAHIDEH HAYYOLALAM et al.: "Edge Intelligence for Empowering IoT-Based Healthcare Systems", IEEE Wireless Communications, vol. 28, no. 3, 30 June 2021 (2021-06-30), pages 6-14, XP011867282, DOI: 10.1109/MWC.001.2000345 *
WANG Zhenxi et al.: "Zone-level compression schemes and compression strategy selection for column-store data", Chinese Journal of Computers, vol. 33, no. 8, 31 August 2010 (2010-08-31), pages 1523-1530 *
LAI Yejing et al.: "Methods and progress in deep neural network model compression", Journal of East China Normal University (Natural Science), no. 5, 30 September 2020 (2020-09-30), pages 68-82 *
ZHAO Haifeng: "Research on association rule mining and Bayesian network representation", China Master's Theses Full-text Database, Information Science and Technology, no. 2007, 31 May 2007 (2007-05-31), pages 140-39 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||