CN113673684A - Edge-end DNN model loading system and method based on input pruning - Google Patents

Edge-end DNN model loading system and method based on input pruning

Info

Publication number: CN113673684A
Application number: CN202110973801.6A
Authority: CN (China)
Prior art keywords: model, input, compression, data, module
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Inventors: 连佳欣, 那俊, 张瀚铎, 张斌
Original and current assignee: Northeastern University China (the listed assignee may be inaccurate; Google has not performed a legal analysis)
Priority and filing date: 2021-08-24 (the priority date is an assumption and is not a legal conclusion)
Publication date: 2021-11-19
Application filed by Northeastern University China

Classifications

    • G06N3/045 Combinations of networks (computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology)
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs (information retrieval; file systems; file servers)
    • G06F16/215 Improving data quality; data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors (information retrieval of structured data; design, administration or maintenance of databases)
    • G06F16/2282 Tablespace storage structures; management thereof (information retrieval of structured data; indexing; data structures therefor; storage structures)
    • G06F18/22 Matching criteria, e.g. proximity measures (pattern recognition; analysing)
    • G06F18/23213 Non-hierarchical clustering techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering (pattern recognition; clustering techniques)
    • G06N3/047 Probabilistic or stochastic networks (neural networks; architecture, e.g. interconnection topology)
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections (neural networks; learning methods)
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks (computing arrangements based on specific mathematical models)


Abstract

The invention discloses an edge-end DNN model loading system and method based on input pruning. The system comprises a management module and a compression module: the management module comprises a sample management module, a model training module and a model management module; the compression module comprises a compression mode selection module, a data compression module, a model compression and retraining module and a compression log recording module. Using this system, a Bayesian network is constructed and trained on sample data, the relevance between the model's inputs and outputs is analyzed through the Bayesian network, and a bidirectional list representing the influence degree of each input is constructed according to Pareto optimality, avoiding the high time complexity that deleting input attributes one by one would incur. Finally, the similarity between input data attributes is analyzed with a clustering algorithm, and the personalized, intelligent compression and loading of the model is completed according to these two compression strategies and the strategies derived from them, improving the overall performance of the edge end and the number of models that can be deployed.

Description

Edge-end DNN model loading system and method based on input pruning
Technical Field
The invention relates to the technical field of edge intelligence, and in particular to an edge-end DNN model loading system and method based on input pruning.
Background
With the development of edge intelligence, the large amount of data generated at the device edge no longer has to be sent to a cloud server for centralized processing; it can be handled with high efficiency and low latency through cloud-edge or edge-edge collaboration, which makes it possible to deploy and run deep neural network models at the edge. However, deep neural network models have extremely large numbers of parameters, in some cases on the order of millions, and training and testing such heavily parameterized models has several deficiencies:
(1) The constructed model parameters must be trained to achieve the expected effect, which consumes a large amount of computing resources and requires devices with strong computing power to reach acceptable training and testing speeds.
(2) Persisting the large number of parameters of a deep neural network also occupies high-capacity memory or disk resources.
In view of these disadvantages, training and testing of neural networks currently run on high-performance servers or clusters, while the limited storage and computing resources of edge smart devices restrict the application of neural networks on them. Compressing DNN models efficiently while preserving computational accuracy, so that they are suitable for running on edge devices, has therefore become a hot issue in edge intelligence research.
At present, common model compression methods for these problems, such as network pruning, mainly produce a small, fast neural network by removing values from redundant weight tensors. However, conventional model compression requires the developer to manually set the number of parameters to delete, retrain the model, adjust the deletion settings according to the model's performance, and repeat these steps until the expected compression effect is reached; the compression result must then pass extensive experimental testing, which imposes a huge workload on developers. Because compression cannot be automated by setting thresholds in advance, this is unfriendly to many applications. In addition, none of the conventional model compression methods reduces the size of the input data, which continues to place significant memory and bandwidth pressure on the model. Model compression at the edge must consider not only the size and capacity of the model but also reduce the amount of input data as much as possible, allowing the deep neural network model to be compressed by processing the data generated at the edge in a personalized manner.
Therefore, for model compression techniques to serve deep neural network models well enough that the models can be deployed and run on resource-limited edge devices, an improved compression method is needed: one that not only shrinks the model, reduces the edge-side data volume and the number of sensors used, and improves the overall performance of the edge end and the number of deployable models, but also further lowers the cost of data transmission, saving network bandwidth and storage resources. In addition, no mature model compression system serving the edge end currently meets the requirements of personalization and intelligence.
Disclosure of Invention
In view of the above deficiencies of the prior art, the present invention provides an edge-end DNN model loading system and method based on input pruning.
To solve the above technical problems, the invention adopts the following technical scheme: an edge-end DNN model loading system based on input pruning, comprising a management module and a compression module;
the management module comprises a sample management module, a model training module and a model management module;
the sample management module is used for collecting data to construct a sample and using the sample as input data of a training model;
the sample is data collected by the edge device, and the data is processed and integrated, and the method specifically comprises the following steps: and performing missing value completion and normalization operation on the collected data, storing the data into a file, and then further storing the description information of the data and the file storage path into a database.
The model training module trains the model according to the constructed sample;
further, the model training module records and stores basic information of the model, including evaluation indexes, model size, model name, model storage path, model input attribute and output attribute number information, and provides reference basis for user compression service.
The model management module is used to provide model management for the user, making it convenient to check the evaluation indexes of trained models;
Furthermore, the model management module also provides model download and upload functions.
The compression module comprises a compression mode selection module, a data compression module, a model compression and retraining module and a compression log recording module;
the compression mode selection module selects different compression modes according to the relevance among different data;
the compression mode in the compression mode selection module comprises: the compression mode comprises a compression mode based on an input and output association pruning strategy, a compression mode based on an input and output similarity pruning strategy, a compression mode based on an input and output association and then based on an input and output similarity pruning strategy and a compression mode based on an input and output association pruning strategy.
The data compression module generates the correspondingly compressed data and stores it under a set path, persists the data information, and supplies the compressed data as input to the model compression and retraining module;
The model compression and retraining module performs the model compression operation according to the compressed data and the compression-model thresholds set by the user, constructs the compressed model, and retrains the compressed model;
the compressed model threshold includes a model size threshold and a model accuracy threshold.
After the model compression method has been executed, the compression log recording module stores the basic information of the model together with its update time, accuracy and recall in the compression log for the user to compare and review.
In another aspect, the invention also provides a method for loading a model using the above edge-end DNN model loading system based on input pruning, comprising the following steps:
Step 1: perform missing-value completion and normalization on the data collected at the edge end, store the data into a file, and further store the filed data into a sample information table of a database;
Step 2: acquire the data in the sample information table, have the user set the relevant parameters of the training model, train the specified model, and store the basic information of the trained model, including its evaluation indexes, model size, model name, model storage path, and the numbers of input and output attributes, in a model information table of the database;
Step 3: according to the data stored in the sample information table, construct a Bayesian network to analyze the relevance between the inputs and outputs of the model, and calculate the influence degree of each input attribute on the output result, thereby constructing the compression mode based on the input-output association pruning strategy;
Step 3 comprises the following substeps:
Step 3.1: design the Bayesian network structure according to the initial model input data in the sample information table, train the Bayesian network, and set the model-size and model-accuracy thresholds;
Step 3.2: calculate the influence degree of each input attribute of the model on the output according to the trained Bayesian network;
Step 3.3: construct a bidirectional list representing the input influence degrees using the multi-objective optimization notion of Pareto optimality, as sketched below.
Step 4: analyze the correlation of the input attributes with a clustering algorithm, based on the input influence degrees obtained in step 3, to construct the compression mode based on the inter-input relation pruning strategy;
Step 4 comprises the following substeps:
Step 4.1: perform cluster analysis with the k-means clustering algorithm on the input influence degrees obtained in step 3, grouping and sorting the attributes whose influence degrees differ little and whose similarity is high;
Step 4.2: on the basis of step 4.1, compress the input data according to the set pruning number and proportion, as sketched below.
Step 5: permute and combine the compression modes constructed in steps 3 and 4 to obtain four different compression modes, select among them according to the relevance among the data to compress the input data, and compress the model trained in step 2 according to the compressed input data using the Kolmogorov theorem, as sketched below;
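The disclosure invokes the Kolmogorov theorem without spelling out the construction; one common reading (Hecht-Nielsen's corollary of the Kolmogorov superposition theorem) sizes a single hidden layer at 2n + 1 units for n inputs. The sketch below assumes that reading, so the resized network is an illustration rather than the patented construction:

```python
import tensorflow as tf

def build_compressed_model(n_inputs: int, n_outputs: int) -> tf.keras.Model:
    # After input pruning, n_inputs is the number of retained attributes;
    # the hidden-layer width 2n + 1 follows the assumed Kolmogorov-theorem
    # sizing rule, shrinking the model together with its input dimension.
    hidden = 2 * n_inputs + 1
    return tf.keras.Sequential([
        tf.keras.layers.Dense(hidden, activation="relu",
                              input_shape=(n_inputs,)),
        tf.keras.layers.Dense(n_outputs, activation="softmax"),
    ])
```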
step 6: retraining the model compressed in the step 5;
and 7: judging whether the model retraining result obtained in the step 6 reaches the threshold of the model size and the model accuracy set by the user, if so, determining that the compression is successful, storing the compressed model and data, and recording a compression log; otherwise, the compression is determined to be failed, and the failure reason and the related parameters are returned to provide parameter basis for the user to re-compress;
and 8: and loading the successfully compressed model to the edge terminal.
The beneficial effects produced by the above technical scheme are as follows:
1. To enable deep neural network models to be deployed and run efficiently on edge devices, the invention provides a personalized, intelligent model loading system that meets user requirements and reduces model compression processing time. The system is aimed mainly at the edge end, so that users can select different model compression modes according to the specific scenario and the characteristics of the model's training data. This serves users while reducing the workload of model-compression developers, and finally realizes a model loading tool that is convenient to operate, efficient in compression and markedly effective, improving the overall performance of the edge end and the number of deployable models.
2. The method analyzes the relevance between the model's inputs and outputs with a Bayesian network, and constructs the bidirectional list representing the input influence degrees according to Pareto optimality, thereby avoiding the high time complexity of deleting input attributes one by one.
3. The method analyzes the relevance between the model's inputs and outputs with a Bayesian network to construct the compression mode based on the input-output association pruning strategy; it analyzes the similarity between input data attributes with a clustering algorithm to construct the compression mode based on the inter-input relation pruning strategy; and it completes the personalized, intelligent compression of the model according to these two compression strategies and the strategies derived from them.
Drawings
Fig. 1 is a schematic structural diagram of an input pruning-based edge-end DNN model loading system provided in an embodiment of the present invention;
fig. 2 is a flowchart of a method for loading a model by using an input pruning based edge DNN model loading system in the embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below with reference to the accompanying drawings and examples. The following examples are intended to illustrate the invention but not to limit its scope.
As shown in Fig. 1, the edge-end DNN model loading system based on input pruning in this embodiment is as follows:
the system comprises a management module and a compression module;
the management module comprises a sample management module, a model training module and a model management module;
the sample management module is used for collecting data to construct a sample and using the sample as input data of a training model;
the sample is data collected by the edge device, and the data is processed and integrated, and the method specifically comprises the following steps: and performing missing value completion and normalization operation on the collected data, storing the data into a file, and then further storing the description information of the data and the file storage path into a database.
The model training module trains the model according to the constructed sample;
Further, the model training module records and stores the basic information of the model, including its evaluation indexes, model size, model name, model storage path, and the numbers of input and output attributes, providing a reference basis for the user's compression service.
The model management module is used to provide model management for the user, making it convenient to check the evaluation indexes of trained models;
Furthermore, the model management module also provides model download and upload functions.
The compression module comprises a compression mode selection module, a data compression module, a model compression and retraining module and a compression log recording module;
the compression mode selection module selects different compression modes according to the relevance among different data;
the compression mode in the compression mode selection module comprises: the compression mode comprises a compression mode based on an input and output association pruning strategy, a compression mode based on an input and output similarity pruning strategy, a compression mode based on an input and output association and then based on an input and output similarity pruning strategy and a compression mode based on an input and output association pruning strategy.
The data compression module generates correspondingly compressed data and stores the data in a set path, meanwhile, data information is stored persistently, and the compressed data is used as input data of the model compression and retraining module;
the model compression and retraining module executes model compression operation according to the compressed data and a compression model threshold value set by a user, constructs a compressed model and executes retraining operation on the compressed model;
the compressed model threshold includes a model size threshold and a model accuracy threshold.
After the model compression method has been executed, the compression log recording module stores the basic information of the model together with its update time, accuracy and recall in the compression log for the user to compare and review.
In another aspect, this embodiment also provides a method for loading a model using the above input-pruning-based edge-end DNN model loading system; the flow of the method is shown in Fig. 2 and comprises the following steps:
Step 1: perform missing-value completion and normalization on the data collected at the edge end, store the data into a file, and further store the filed data into a sample information table of a database;
Step 2: acquire the data in the sample information table, have the user set the relevant parameters of the training model, train the specified model, and store the basic information of the trained model, including its evaluation indexes, model size, model name, model storage path, and the numbers of input and output attributes, in a model information table of the database;
Step 3: according to the data stored in the sample information table, construct a Bayesian network to analyze the relevance between the inputs and outputs of the model, and calculate the influence degree of each input attribute on the output result, thereby constructing the compression mode based on the input-output association pruning strategy;
Step 3 comprises the following substeps:
Step 3.1: design the Bayesian network structure according to the initial model input data in the sample information table, train the Bayesian network, and set the model-size and model-accuracy thresholds;
Step 3.2: calculate the influence degree of each input attribute of the model on the output according to the trained Bayesian network;
Step 3.3: construct a bidirectional list representing the input influence degrees using the multi-objective optimization notion of Pareto optimality.
In this embodiment, the input attributes are pruned according to the input influence degree list: input attributes weakly associated with the output result are deleted, each deletion corresponding to the removal of one node of the list.
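A minimal sketch of this deletion operation on a node of the bidirectional list (any object with prev/next pointers, such as the AttrNode sketched earlier); unlinking is a constant-time pointer update:

```python
def delete_node(head, node):
    # Unlink `node` from the bidirectional influence list and return the
    # (possibly new) head. Each pruned input attribute costs O(1).
    if node.prev is not None:
        node.prev.next = node.next
    else:
        head = node.next  # the deleted node was the head
    if node.next is not None:
        node.next.prev = node.prev
    node.prev = node.next = None
    return head
```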
Step 4: analyze the correlation of the input attributes with a clustering algorithm, based on the input influence degrees obtained in step 3, to construct the compression mode based on the inter-input relation pruning strategy;
Step 4 comprises the following substeps:
Step 4.1: perform cluster analysis with the k-means clustering algorithm on the input influence degrees obtained in step 3, grouping and sorting the attributes whose influence degrees differ little and whose similarity is high;
Step 4.2: on the basis of step 4.1, compress the input data according to the set pruning number and proportion.
Step 5: permute and combine the compression modes constructed in steps 3 and 4 to obtain four different compression modes, select among them according to the relevance among the data to compress the input data, and compress the model trained in step 2 according to the compressed input data using the Kolmogorov theorem;
Step 6: retrain the model compressed in step 5;
Step 7: judge whether the retraining result obtained in step 6 reaches the model-size and model-accuracy thresholds set by the user; if so, the compression is deemed successful, the compressed model and data are stored, and a compression log is recorded; otherwise, the compression is deemed failed, and the failure reason and related parameters are returned to give the user a parameter basis for re-compression;
Step 8: load the successfully compressed model to the edge end; steps 5 to 8 are sketched together below.
In this embodiment, the user selects the model and the corresponding data to be compressed, sets the compression mode to be used, and fills in the necessary compression parameters; the system then automatically performs the compression and loading of the model. The user can adjust the model's compression strategy according to his own requirements or the feedback from the compression and loading operation.

Claims (9)

1. An edge-end DNN model loading system based on input pruning, characterized in that the system comprises a management module and a compression module;
the management module comprises a sample management module, a model training module and a model management module;
the sample management module is used for collecting data to construct samples and supplying the samples as input data for model training;
the model training module trains the model according to the constructed samples;
the model management module is used to provide model management for the user, making it convenient to check the evaluation indexes of trained models;
the compression module comprises a compression mode selection module, a data compression module, a model compression and retraining module and a compression log recording module;
the compression mode selection module selects different compression modes according to the relevance among different data;
the data compression module generates the correspondingly compressed data and stores it under a set path, persists the data information, and supplies the compressed data as input to the model compression and retraining module;
the model compression and retraining module performs the model compression operation according to the compressed data and the compression-model thresholds set by the user, constructs the compressed model, and retrains the compressed model;
after the model compression method has been executed, the compression log recording module stores the basic information of the model together with its update time, accuracy and recall in the compression log for the user to compare and review.
2. The input pruning-based edge-end DNN model loading system of claim 1, wherein: the sample is data collected by an edge device that has been processed and integrated, specifically: missing-value completion and normalization are performed on the collected data, the data are stored into a file, and the data's description information and the file storage path are then stored into a database.
3. The input pruning-based edge-end DNN model loading system of claim 1, wherein: the model training module is also used to record and store the basic information of the model, including its evaluation indexes, model size, model name, model storage path, and the numbers of input and output attributes, providing a reference basis for the user's compression service.
4. The input pruning-based edge-end DNN model loading system of claim 1, wherein: the model management module also provides the functions of downloading and uploading the model.
5. The input pruning-based edge-end DNN model loading system of claim 1, wherein: the compression modes in the compression mode selection module comprise: a compression mode based on the input-output association pruning strategy, a compression mode based on the inter-input similarity pruning strategy, a compression mode that first applies input-output association pruning and then inter-input similarity pruning, and a compression mode that first applies inter-input similarity pruning and then input-output association pruning.
6. The input pruning-based edge-end DNN model loading system of claim 1, wherein: the compressed model threshold includes a model size threshold and a model accuracy threshold.
7. A method for model loading using the input pruning-based edge-end DNN model loading system of any one of claims 1 to 6, characterized by comprising the following steps:
step 1: performing missing-value completion and normalization on the data collected at the edge end, storing the data into a file, and further storing the filed data into a sample information table of a database;
step 2: acquiring the data in the sample information table, having the user set the relevant parameters of the training model, training the specified model, and storing the basic information of the trained model, including its evaluation indexes, model size, model name, model storage path, and the numbers of input and output attributes, in a model information table of the database;
step 3: according to the data stored in the sample information table, constructing a Bayesian network to analyze the relevance between the inputs and outputs of the model, and calculating the influence degree of each input attribute on the output result, thereby constructing the compression mode based on the input-output association pruning strategy;
step 4: analyzing the correlation of the input attributes with a clustering algorithm, based on the input influence degrees obtained in step 3, to construct the compression mode based on the inter-input relation pruning strategy;
step 5: permuting and combining the compression modes constructed in steps 3 and 4 to obtain four different compression modes, selecting among them according to the relevance among the data to compress the input data, and compressing the model trained in step 2 according to the compressed input data using the Kolmogorov theorem;
step 6: retraining the model compressed in step 5;
step 7: judging whether the retraining result obtained in step 6 reaches the model-size and model-accuracy thresholds set by the user; if so, deeming the compression successful, storing the compressed model and data, and recording a compression log; otherwise, deeming the compression failed, and returning the failure reason and related parameters to give the user a parameter basis for re-compression;
step 8: loading the successfully compressed model to the edge end.
8. The method for model loading using the input pruning-based edge-end DNN model loading system according to claim 7, wherein step 3 comprises the following substeps:
step 3.1: designing the Bayesian network structure according to the initial model input data in the sample information table, training the Bayesian network, and setting the model-size and model-accuracy thresholds;
step 3.2: calculating the influence degree of each input attribute of the model on the output according to the trained Bayesian network;
step 3.3: constructing a bidirectional list representing the input influence degrees using the multi-objective optimization notion of Pareto optimality.
9. The method for model loading using the input pruning-based edge-end DNN model loading system according to claim 7, wherein step 4 comprises the following substeps:
step 4.1: performing cluster analysis with the k-means clustering algorithm on the input influence degrees obtained in step 3, grouping and sorting the attributes whose influence degrees differ little and whose similarity is high;
step 4.2: on the basis of step 4.1, compressing the input data according to the set pruning number and proportion.
Application CN202110973801.6A, filed 2021-08-24 (also the priority date): Edge-end DNN model loading system and method based on input pruning. Status: Pending.

Publication

CN113673684A, published 2021-11-19

Family

ID=78545571

Country status

CN: CN113673684A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination