CN114548884B - A package identification method and system based on pruning lightweight model - Google Patents



Publication number
CN114548884B
CN114548884B
Authority
CN
China
Prior art keywords
matrix
pruning
entropy
von neumann
channel
Prior art date
Legal status
Active
Application number
CN202210447343.7A
Other languages
Chinese (zh)
Other versions
CN114548884A (en)
Inventor
史朝坤
刁华彬
许绍云
郝悦星
Current Assignee
Institute of Microelectronics of CAS
Original Assignee
Institute of Microelectronics of CAS
Priority date
Filing date
Publication date
Application filed by Institute of Microelectronics of CAS filed Critical Institute of Microelectronics of CAS
Priority to CN202210447343.7A priority Critical patent/CN114548884B/en
Publication of CN114548884A publication Critical patent/CN114548884A/en
Application granted granted Critical
Publication of CN114548884B publication Critical patent/CN114548884B/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/08: Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/083: Shipping
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing


Abstract

The invention relates to a package recognition method and system based on a pruned lightweight model, belongs to the technical field of image recognition, and solves the problems that existing neural network models have high computational complexity, consume much time and memory, and are difficult to deploy on terminal devices. The method includes: inputting training images into the neural network model to be pruned and extracting the feature map matrix of each convolutional layer; converting the feature map matrix into a weighted undirected graph, constructing an improved Laplacian matrix, and computing the von Neumann graph entropy as the original value; deleting single vertices of the weighted undirected graph in turn to obtain new weighted undirected graphs, and computing the change of each new graph's von Neumann graph entropy relative to the original value; computing channel importance from the changes of the von Neumann graph entropy and pruning the channels to obtain a pruned lightweight model; and deploying the pruned lightweight model to package recognition terminal devices to recognize images collected in real time. A high pruning rate is achieved and the efficiency of real-time package recognition is improved.

Description

A package recognition method and system based on a pruned lightweight model

Technical Field

The invention relates to the technical field of image recognition, and in particular to a package recognition method and system based on a pruned lightweight model.

Background

In recent years, China's express delivery business has grown rapidly overall and has huge development potential. To meet the sorting demand created by this rapid growth, the express delivery industry places high requirements on the sorting capacity of logistics transit centers. Modern logistics uses automatic parcel sorting systems to sort parcels by destination region, and parcel recognition and detection is an important part of express sorting.

Among current image recognition methods, traditional image processing approaches suffer from the difficulty of hand-designing features for complex parcel sorting scenes; in contrast, deep-learning-based object recognition algorithms have clear technical advantages.

With the continuous development of deep learning theory, deep convolutional neural networks have advanced toward deeper and wider architectures. On the one hand, such complex models can effectively improve the learning capacity of the network and thus achieve better performance; on the other hand, they cause the number of parameters and floating-point operations to grow sharply, which means higher energy consumption, a larger memory footprint, and longer training and inference times. Deep convolutional neural networks have therefore long had strict requirements on their deployment environment and are usually deployed on the server side, which cannot satisfy practical needs. In logistics sorting scenarios, real-time package recognition requires deep learning models with few parameters and fast computation, while the inference time of complex models and the communication latency between terminal and server cannot meet the requirement of processing package images quickly, which greatly limits the adoption of deep learning in this field.

Summary of the Invention

In view of the above analysis, embodiments of the present invention aim to provide a package recognition method and system based on a pruned lightweight model, to solve the problems that existing neural network models have high computational complexity, consume much time and memory, and are difficult to deploy on terminal devices.

In one aspect, an embodiment of the present invention provides a package recognition method based on a pruned lightweight model, including the following steps:

inputting training images into a pre-trained neural network model to be pruned, and extracting the feature map matrix of each convolutional layer;

converting the feature map matrix of each convolutional layer into a weighted undirected graph, constructing an improved Laplacian matrix from the magnitude matrix of the feature map matrix, and computing the von Neumann graph entropy as the original value of each convolutional layer; deleting single vertices of each convolutional layer's weighted undirected graph in turn to obtain new weighted undirected graphs, and computing the change of each new graph's von Neumann graph entropy relative to the original value;

computing the importance of each convolutional layer's channels from the changes of the von Neumann graph entropy, pruning the channels of each convolutional layer according to the layer's pruning rate and the channel importance, and training the pruned neural network model to obtain a pruned lightweight model;

deploying the pruned lightweight model to package recognition terminal devices, and recognizing the package information in images collected in real time.

In a further improvement of the above method, converting the feature map matrix of each convolutional layer into a weighted undirected graph includes:

reshaping each three-dimensional feature map in the feature map matrix of each convolutional layer into a feature row vector matrix, in which each row is the feature row vector corresponding to one channel and serves as a vertex of that convolutional layer's weighted undirected graph;

computing the cosine distance between any two feature row vectors in each convolutional layer's feature row vector matrix as the edge weight between the corresponding two vertices of that layer's weighted undirected graph;

obtaining the adjacency matrix and the degree matrix from the weighted undirected graph.

In a further improvement of the above method, the adjacency matrix is a real symmetric matrix whose off-diagonal elements are the edge weights between the corresponding pairs of vertices and whose diagonal elements are 0; the degree matrix is a diagonal matrix whose diagonal element in each row is the sum of all elements of the corresponding row of the adjacency matrix.

In a further improvement of the above method, constructing the improved Laplacian matrix from the magnitude matrix of the feature map matrix includes:

obtaining the magnitude matrix from the feature map matrix, the magnitude matrix being a diagonal matrix in which each diagonal element is the sum of squares of all elements of the corresponding channel of the feature map matrix;

multiplying the degree matrix by the magnitude matrix and then subtracting the adjacency matrix to obtain the improved Laplacian matrix.
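A minimal NumPy sketch of this construction for a single image's feature map. The function name, the small epsilon guard against zero-norm channels, and the use of cosine similarity as the edge weight follow this document's description but are otherwise illustrative choices, not a definitive implementation:

```python
import numpy as np

def improved_laplacian(feat, eps=1e-12):
    """Improved Laplacian L_d = S*M - W for one feature map.

    feat: array of shape (n_channels, h, w), one image's feature map.
    """
    n = feat.shape[0]
    X = feat.reshape(n, -1)                   # feature row vectors, one per channel
    norms = np.linalg.norm(X, axis=1) + eps
    W = (X @ X.T) / np.outer(norms, norms)    # cosine edge weights
    np.fill_diagonal(W, 0.0)                  # adjacency diagonal is 0
    S = np.diag(W.sum(axis=1))                # degree matrix
    M = np.diag((X ** 2).sum(axis=1))         # magnitude matrix: per-channel sum of squares
    return S @ M - W                          # degree * magnitude, minus adjacency
```

Since S and M are diagonal and W is symmetric, the resulting L_d is a real symmetric matrix, which is what the entropy computation below the eigendecomposition relies on in practice.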

In a further improvement of the above method, the von Neumann graph entropy is computed as:

$$H_i = -\sum_{k=1}^{n_i} \frac{\lambda_k}{\operatorname{trace}(L_d)} \log \frac{\lambda_k}{\operatorname{trace}(L_d)}$$

where H_i denotes the von Neumann graph entropy of the i-th convolutional layer and L_d denotes the improved Laplacian matrix; trace(·) denotes the trace of a matrix, i.e., the sum of all its eigenvalues; λ_k denotes the k-th eigenvalue of the improved Laplacian matrix L_d, with λ_k ≥ 0 and

$$\sum_{k=1}^{n_i} \lambda_k = \operatorname{trace}(L_d),$$

and n_i denotes the number of channels of the i-th convolutional layer.

In a further improvement of the above method, deleting single vertices of each convolutional layer's weighted undirected graph in turn to obtain new weighted undirected graphs and computing the change of each new graph's von Neumann graph entropy relative to the original value includes:

for each convolutional layer's weighted undirected graph, deleting the feature row vector corresponding to a single vertex in turn, converting the feature row vector matrix obtained after each deletion into a new weighted undirected graph, constructing a new improved Laplacian matrix, and computing a new von Neumann graph entropy;

computing the absolute difference between each new von Neumann graph entropy and the original value of the corresponding convolutional layer as the change of the von Neumann graph entropy for the channel corresponding to the deleted vertex.
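A self-contained sketch of this per-channel procedure. It rebuilds the graph and recomputes the entropy from scratch after each vertex deletion, as described above; the compact helper functions, their names, and the numerical guards are illustrative assumptions:

```python
import numpy as np

def _laplacian(X, eps=1e-12):
    # X: (n, d) feature row vectors; cosine edge weights, degree and magnitude matrices
    norms = np.linalg.norm(X, axis=1) + eps
    W = (X @ X.T) / np.outer(norms, norms)
    np.fill_diagonal(W, 0.0)
    S = np.diag(W.sum(axis=1))
    M = np.diag((X ** 2).sum(axis=1))
    return S @ M - W

def _vnge(L, tol=1e-12):
    lam = np.clip(np.linalg.eigvalsh(L), 0.0, None)
    p = lam / lam.sum()
    p = p[p > tol]
    return float(-(p * np.log(p)).sum())

def entropy_changes(feat):
    """Per-channel |delta H|: delete one vertex at a time and recompute the entropy."""
    X = feat.reshape(feat.shape[0], -1)
    base = _vnge(_laplacian(X))                         # original value of the layer
    return np.array([abs(_vnge(_laplacian(np.delete(X, k, axis=0))) - base)
                     for k in range(X.shape[0])])
```

A channel whose deletion barely changes the graph entropy is considered redundant, which is exactly what the pruning criterion below exploits.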

In a further improvement of the above method, computing the importance of each convolutional layer's channels from the changes of the von Neumann graph entropy includes:

from the von Neumann graph entropy changes of each convolutional layer's channels obtained for each training image, computing the average entropy change of each channel as that channel's importance.
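A small worked example of this averaging step, with made-up |ΔH| values for illustration (rows are training images, columns are channels of one layer):

```python
import numpy as np

# |delta H| per image (rows) and per channel (columns); values are hypothetical.
delta_h = np.array([[0.10, 0.30, 0.05],
                    [0.12, 0.28, 0.07]])

# Channel importance is the mean entropy change over the training images.
importance = delta_h.mean(axis=0)
print(importance)   # [0.11 0.29 0.06]
```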

In a further improvement of the above method, pruning the channels of each convolutional layer according to the layer's pruning rate and the channel importance means pruning the channels of each layer in order of increasing importance, up to the number given by that layer's pruning rate.
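A sketch of this selection rule for one layer. The function name and the rounding of the pruned-channel count are assumptions; the patent only specifies pruning from least to most important up to the layer's pruning rate:

```python
import numpy as np

def channels_to_prune(importance, prune_rate):
    """Indices of the least-important channels to remove from one layer.

    importance: per-channel mean |delta H| (higher means more important).
    prune_rate: fraction of channels to remove, e.g. 0.5.
    """
    n_prune = int(len(importance) * prune_rate)   # rounding down is an assumption
    order = np.argsort(importance)                # ascending: least important first
    return sorted(order[:n_prune].tolist())
```

For example, with importances [0.3, 0.1, 0.4, 0.2] and a pruning rate of 0.5, the two least-important channels (indices 1 and 3) would be removed.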

In a further improvement of the above method, the neural network model to be pruned is a neural network model composed of convolutional layers with a CONV-BN-ReLU structure.

In another aspect, an embodiment of the present invention provides a package recognition system based on a pruned lightweight model, including:

a feature map matrix extraction module, configured to input training images into a pre-trained neural network model to be pruned and extract the feature map matrix of each convolutional layer;

a von Neumann graph entropy computation module, configured to convert the feature map matrix of each convolutional layer into a weighted undirected graph, construct an improved Laplacian matrix from the magnitude matrix of the feature map matrix, and compute the von Neumann graph entropy as the original value of each convolutional layer; and to delete single vertices of each convolutional layer's weighted undirected graph in turn to obtain new weighted undirected graphs and compute the change of each new graph's von Neumann graph entropy relative to the original value;

a channel pruning module, configured to compute the importance of each convolutional layer's channels from the changes of the von Neumann graph entropy, prune the channels of each convolutional layer according to the layer's pruning rate and the channel importance, and train the pruned neural network model to obtain a pruned lightweight model;

a package recognition module, configured to deploy the pruned lightweight model to package recognition terminal devices and recognize the package information in images collected in real time.

Compared with the prior art, the present invention can achieve at least one of the following beneficial effects:

1. Theories such as undirected graphs and von Neumann graph entropy are applied to neural network pruning: the feature map matrix of the neural network model is converted into a weighted undirected graph that accounts for both the magnitude and the correlation of the feature maps, and the change of the von Neumann graph entropy after deleting a single channel is computed and used as the pruning criterion. This preserves the performance of the pruned lightweight model and achieves significant model compression and acceleration.

2. The von Neumann graph entropy of the feature map matrix is highly stable and only weakly correlated with the number of input training images, so accurate channel importance can be obtained from a small number of training images, giving the method high pruning efficiency, low memory usage, and stable pruning results.

3. The channel pruning method of the present invention is applicable to neural network models composed of convolutional layers with a CONV-BN-ReLU structure. It reduces the parameter count and computation as much as possible while maximizing performance, making the neural network model lightweight; the compressed lightweight deep neural network model can then be deployed on resource-constrained edge devices, expanding its scope of use and maximizing its application value.

In the present invention, the above technical solutions can also be combined with one another to achieve further preferred combinations. Additional features and advantages of the invention will be set forth in the description that follows, and in part will become apparent from the description or be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by what is particularly pointed out in the description and drawings.

Brief Description of the Drawings

The drawings are only for the purpose of illustrating specific embodiments and are not to be considered limiting of the invention; throughout the drawings, the same reference signs denote the same components.

Fig. 1 is a flowchart of a package recognition method based on a pruned lightweight model in Embodiment 1 of the present invention;

Fig. 2 is a pruning example diagram of the package recognition method based on a pruned lightweight model in Embodiment 1 of the present invention;

Fig. 3 is a block diagram of a package recognition system based on a pruned lightweight model in Embodiment 2 of the present invention.

Detailed Description

Preferred embodiments of the present invention are described below with reference to the accompanying drawings, which form a part of this application and, together with the embodiments of the invention, serve to explain the principles of the invention without limiting its scope.

In logistics sorting scenarios, real-time package recognition requires deep learning models with few parameters and fast computation, while the inference time of complex models and the communication latency between terminal and server cannot meet the requirement of processing package images quickly. Aiming at the problems of current deep neural network models (large parameter counts and floating-point computation, high time and memory consumption, and difficulty of deployment), the present invention proposes a package recognition method and system based on a pruned lightweight model. Theories such as undirected graphs and von Neumann graph entropy are applied to neural network pruning: the feature map matrix is converted into a weighted undirected graph, and the change of the von Neumann graph entropy after pruning each channel is computed as a measure of channel importance. While preserving the performance of the pruned lightweight model, significant model compression and acceleration are achieved, facilitating deployment on resource-constrained package recognition terminals and meeting the demands of real-time package recognition.

Embodiment 1

A specific embodiment of the present invention discloses a package recognition method based on a pruned lightweight model. As shown in Fig. 1, the method includes the following steps:

S101: input training images into a pre-trained neural network model to be pruned, and extract the feature map matrix of each convolutional layer.

It should be noted that this embodiment does not restrict the specific network structure of the neural network model to be pruned, as long as it is a neural network model composed of convolutional layers with a CONV-BN-ReLU structure. The CONV-BN-ReLU structure is widely used in many mainstream convolutional neural network models, so the pruning scheme of this method can be conveniently applied to mainstream neural network models in fields such as classification and recognition to make them lightweight.

For example, the neural network model to be pruned may be a model from the YOLO or ResNet series.

It should be noted that a package image dataset for pre-training the neural network model to be pruned is constructed by continuously photographing with the camera above the parcel sorting conveyor, collecting and annotating the images, and preprocessing the image data, for example by scaling. In this step, pruning of the pre-trained model begins, and the required training images are randomly sampled from the package image dataset. Using more training images can in theory yield more accurate channel importance for the feature maps, but it increases the pruning time and lowers the pruning efficiency without significantly improving pruning performance; using fewer training images fails to yield sufficiently accurate channel importance, and the pruning result deteriorates. Therefore, to balance pruning efficiency and pruning performance, 600 to 1000 images are used for pruning. Preferably, 640 training images are used, divided into 5 batches of 128 images each.

Specifically, the training images are fed in batches into the pre-trained neural network model to be pruned. The feature map matrix extracted from each convolutional layer is a four-dimensional matrix containing the three-dimensional feature map produced by each image in the batch, and can be expressed as:

$$O_i = \{F_{i,1}, F_{i,2}, \ldots, F_{i,b}\} \in \mathbb{R}^{b \times n_i \times h_i \times w_i}$$

$$F_{i,j} = \{f_{i,j}^{1}, f_{i,j}^{2}, \ldots, f_{i,j}^{n_i}\} \in \mathbb{R}^{n_i \times h_i \times w_i}$$

where O_i denotes the feature map matrix output by the i-th convolutional layer; F_{i,j} denotes the j-th feature map in the feature map matrix O_i; b denotes the number of training images input to the neural network model to be pruned in one batch; n_i denotes the number of channels of each feature map in the i-th convolutional layer's output feature map matrix; h_i and w_i denote the height and width of the feature maps, respectively; and f_{i,j}^{k} denotes the k-th channel map of the j-th feature map F_{i,j} in the i-th convolutional layer's output feature map matrix.

For example, the ResNet56 neural network model has 55 convolutional layers. When each batch of 128 training images is input, the feature map matrix produced by each convolutional layer is extracted, so a total of 275 feature map matrices are extracted over the 5 batches, each containing 128 three-dimensional feature maps.

S102: convert the feature map matrix of each convolutional layer into a weighted undirected graph, construct the improved Laplacian matrix from the magnitude matrix of the feature map matrix, and compute the von Neumann graph entropy as the original value of each convolutional layer; delete single vertices of each convolutional layer's weighted undirected graph in turn to obtain new weighted undirected graphs, and compute the change of each new graph's von Neumann graph entropy relative to the original value of the corresponding convolutional layer.

It should be noted that converting the feature map matrix of each convolutional layer into a weighted undirected graph includes the following steps.

(1) Reshape each three-dimensional feature map in each convolutional layer's feature map matrix into a feature row vector matrix, in which each row is the feature row vector corresponding to one channel and serves as a vertex of that convolutional layer's weighted undirected graph.

Specifically, each extracted three-dimensional feature map (n_i, h_i, w_i) is reshaped by keeping the number of channels n_i unchanged and flattening all elements of each channel's (h_i, w_i) matrix into one row, yielding n_i feature row vectors that correspond to the n_i vertices of the weighted undirected graph, where h_i and w_i denote the height and width of the feature map and h_i·w_i is the number of elements in each reshaped feature row vector. The feature row vector matrix obtained after reshaping is expressed as:

$$X_{i,j} = \{x_{i,j}^{1}, x_{i,j}^{2}, \ldots, x_{i,j}^{n_i}\} \in \mathbb{R}^{n_i \times h_i w_i}$$

where X_{i,j} denotes the reshaped feature row vector matrix of the j-th feature map of the i-th convolutional layer, and x_{i,j}^{k} in X_{i,j} denotes the feature row vector corresponding to the k-th channel of the feature map, which serves as one vertex of the weighted undirected graph G of the i-th convolutional layer.
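A small NumPy example of this reshaping step; the concrete sizes are chosen only for illustration:

```python
import numpy as np

# One three-dimensional feature map: n_i channels of size h_i x w_i.
n_i, h_i, w_i = 4, 2, 3
feature_map = np.arange(n_i * h_i * w_i, dtype=float).reshape(n_i, h_i, w_i)

# Flatten each channel into one row: the feature row vector matrix,
# whose n_i rows become the vertices of the weighted undirected graph.
row_vectors = feature_map.reshape(n_i, h_i * w_i)

print(row_vectors.shape)   # (4, 6)
```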

②计算各卷积层的特征行向量矩阵中任意两个特征行向量的余弦距离,作为各卷积层的带权无向图中对应两个顶点之间的边权重;② Calculate the cosine distance of any two feature row vectors in the feature row vector matrix of each convolution layer as the edge weight between the corresponding two vertices in the weighted undirected graph of each convolution layer;

Specifically, the cosine distance between any two feature row vectors b_p and b_q in A_j^i is computed according to formula (4):

dis_{p,q} = ( Σ_{m=1}^{h_i·w_i} b_{p,m}·b_{q,m} ) / ( sqrt(Σ_{m=1}^{h_i·w_i} b_{p,m}^2) · sqrt(Σ_{m=1}^{h_i·w_i} b_{q,m}^2) )    (4)

where b_{p,m} and b_{q,m} denote the mth element of b_p and b_q, respectively. The cosine distances between all pairs of row vectors in A_j^i are computed and used as the weights of the edges between the corresponding pairs of vertices of the resulting weighted undirected graph G.

③ Obtain the adjacency matrix and the degree matrix from the weighted undirected graph.

It should be noted that the adjacency matrix of a weighted undirected graph is a real symmetric matrix whose off-diagonal elements are the edge weights between the corresponding pairs of vertices and whose diagonal elements are 0:

W = [ 0, dis_{1,2}, …, dis_{1,n_i} ; dis_{2,1}, 0, …, dis_{2,n_i} ; … ; dis_{n_i,1}, dis_{n_i,2}, …, 0 ]    (5)

where W denotes the adjacency matrix of the weighted undirected graph G, and its off-diagonal element dis_{p,q} is the weight of the edge between vertex p and vertex q.

The degree matrix of a weighted undirected graph is a diagonal matrix in which each diagonal element is the sum of all elements of the corresponding row of the adjacency matrix:

S = diag(s_1, s_2, …, s_{n_i})    (6)

s_k = Σ_{q=1}^{n_i} dis_{k,q}    (7)

where S denotes the degree matrix of the weighted undirected graph G, and its diagonal element s_k is the sum of all elements in the kth row of the adjacency matrix W.
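Both matrices can be built at once from the row-vector matrix; a numpy sketch under the same cosine-weight assumption as above:

```python
import numpy as np

def build_adjacency_and_degree(A):
    # A: (n_i, h_i*w_i) row-vector matrix. Row-normalize, then the Gram
    # matrix gives all pairwise cosine weights at once.
    An = A / np.linalg.norm(A, axis=1, keepdims=True)
    W = An @ An.T
    np.fill_diagonal(W, 0.0)        # diagonal elements are defined as 0
    S = np.diag(W.sum(axis=1))      # s_k = sum of row k of W
    return W, S

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
W, S = build_adjacency_and_degree(A)
```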

Through the above three steps, the feature map matrix of each convolutional layer is converted into a weighted undirected graph that describes the correlation between the channels of that layer.

In this embodiment, to obtain a more accurate pruning criterion, channel correlation is described by the weighted undirected graph and the individual importance of each channel is described by the magnitude of its feature map; the two are finally fused to construct the improved Laplacian matrix.

Specifically, the magnitude of the feature map is represented by its magnitude matrix, which measures the influence of each channel on the neural network model. The magnitude matrix is a diagonal matrix in which each diagonal element is the sum of squares of all elements in the corresponding channel of the feature map matrix:

D = diag(d_1, d_2, …, d_{n_i})    (8)

d_k = Σ_{m=1}^{h_i·w_i} b_{k,m}^2    (9)

where D denotes the magnitude matrix of the feature map, and its diagonal element d_k is the sum of squares of all elements in the kth channel of the feature map matrix.
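The magnitude matrix reduces to one line per channel (sketch; the sample values are invented):

```python
import numpy as np

def magnitude_matrix(A):
    # d_k = sum of squares of all elements of channel k (row k of A).
    return np.diag(np.sum(A ** 2, axis=1))

A = np.array([[1.0, 2.0], [3.0, 0.0]])
D = magnitude_matrix(A)   # diagonal entries: 1+4 = 5 and 9+0 = 9
```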

The improved Laplacian matrix is obtained by multiplying the degree matrix element-wise with the magnitude matrix and then subtracting the adjacency matrix:

L_d = S * D − W    (10)

where L_d denotes the improved Laplacian matrix, a real symmetric positive semi-definite matrix, and * denotes element-wise multiplication of corresponding matrix entries.
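Since S and D are both diagonal, S * D is simply diagonal with entries s_k·d_k; a toy sketch with hand-picked values:

```python
import numpy as np

def improved_laplacian(W, S, D):
    # L_d = S * D - W, with '*' as element-wise (Hadamard) multiplication.
    return S * D - W

W = np.array([[0.0, 0.5], [0.5, 0.0]])   # toy adjacency matrix
S = np.diag(W.sum(axis=1))               # degree matrix
D = np.diag([2.0, 4.0])                  # hypothetical magnitude matrix
L_d = improved_laplacian(W, S, D)        # [[1.0, -0.5], [-0.5, 2.0]]
```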

From the improved Laplacian matrix, the von Neumann graph entropy is computed by the following formula:

H_i = − Σ_{k=1}^{n_i} (λ_k / trace(L_d)) · log(λ_k / trace(L_d))    (11)

where H_i denotes the von Neumann graph entropy of the ith convolutional layer; trace(·) denotes the trace of a matrix, i.e., the sum of all its eigenvalues; λ_k denotes the kth eigenvalue of the improved Laplacian matrix L_d, with λ_k ≥ 0 and Σ_{k=1}^{n_i} λ_k = trace(L_d); and n_i denotes the number of channels of the ith convolutional layer.
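Formula (11) normalizes the eigenvalues of L_d by the trace so that they form a probability distribution, then takes a Shannon-style entropy over it. A numpy sketch (the logarithm base 2 is an assumption, as the base is not visible in the reproduced text):

```python
import numpy as np

def von_neumann_graph_entropy(L_d):
    # Eigenvalues of the real symmetric L_d, clipped at 0 to absorb
    # numerical noise; zero eigenvalues are skipped, taking 0*log 0 = 0.
    lam = np.clip(np.linalg.eigvalsh(L_d), 0.0, None)
    p = lam / lam.sum()            # trace(L_d) = sum of eigenvalues
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

L_d = np.array([[1.0, -0.5], [-0.5, 2.0]])
H = von_neumann_graph_entropy(L_d)
```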

It should be noted that the von Neumann graph entropy is highly stable and only weakly correlated with the number of input training images, so only a small number of training images is needed to obtain accurate feature map channel importance. The von Neumann graph entropy computed from the feature map matrix of each convolutional layer serves as the original value of that layer; next, the change in von Neumann graph entropy after deleting a single channel is computed and used as the pruning criterion.

Specifically, deleting single vertices in turn from the weighted undirected graph of each convolutional layer yields new weighted undirected graphs, and the change of each new graph's von Neumann graph entropy relative to the original value of the corresponding convolutional layer is computed as follows:

For the weighted undirected graph of each convolutional layer, the feature row vector corresponding to a single vertex is deleted in turn; the feature row vector matrix obtained after each deletion is converted back into a new weighted undirected graph, a new improved Laplacian matrix is constructed, and a new von Neumann graph entropy is computed.

It should be noted that deleting a single vertex from the original graph G means deleting the corresponding feature row vector from the feature row vector matrix A_j^i; the weighted undirected graph G' is then reconstructed according to formulas (4)-(9), a new improved Laplacian matrix L_d' is obtained according to formula (10), and a new von Neumann graph entropy H' is obtained according to formula (11).

The absolute value of the difference between each new von Neumann graph entropy and the original value of the corresponding convolutional layer is taken as the change value of the von Neumann graph entropy for the channel corresponding to the deleted vertex:

ΔH = abs(H' − H_i)    (12)

where abs(·) denotes the absolute-value operation.
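The leave-one-vertex-out procedure can be sketched end to end: for each channel, drop its row, rebuild the graph and the improved Laplacian, and record the absolute entropy change. This is a self-contained sketch under the same cosine-weight and log-base assumptions as above, not the patent's code:

```python
import numpy as np

def vnge(A):
    # Entropy of the graph built from row-vector matrix A: cosine-weighted
    # adjacency, degree matrix, magnitude matrix, improved Laplacian.
    An = A / np.linalg.norm(A, axis=1, keepdims=True)
    W = An @ An.T
    np.fill_diagonal(W, 0.0)
    S = np.diag(W.sum(axis=1))
    D = np.diag(np.sum(A ** 2, axis=1))
    lam = np.clip(np.linalg.eigvalsh(S * D - W), 0.0, None)
    p = lam / lam.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def entropy_changes(A):
    # abs(H' - H) after deleting each channel (row of A) in turn.
    H = vnge(A)
    return np.array([abs(vnge(np.delete(A, k, axis=0)) - H)
                     for k in range(A.shape[0])])

A = np.random.rand(4, 15)      # hypothetical 4-channel layer
delta_H = entropy_changes(A)   # one change value per channel
```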

S103: Calculate the importance of the channels of each convolutional layer from the change values of the von Neumann graph entropy in that layer, prune the channels of each convolutional layer according to its pruning rate, and train the pruned neural network model to obtain the pruned lightweight model.

It should be noted that the von Neumann graph entropy helps measure the information difference and distance between graphs. If deleting a channel changes the von Neumann graph entropy noticeably, that channel is highly important and must be retained during pruning to preserve the performance of the pruned lightweight model. The goal of model channel pruning is therefore to remove unimportant convolutional channels from the neural network model while preserving the original model's performance and generalization ability to the greatest extent possible. This goal can be formulated as:

max_δ Acc ≈ max_δ Σ_{k=1}^{n_i} δ_k·CI(b_k), subject to δ_k ∈ {0, 1} and Σ_{k=1}^{n_i} δ_k = n_i'    (13)

where Acc denotes the performance of the pruned neural network model; δ_k denotes the mask of the kth channel, taking the value 0 or 1, with δ_k = 0 meaning the channel is pruned and δ_k = 1 meaning the channel is retained; CI(b_k) denotes the importance of the kth channel b_k; n_i denotes the number of channels in the convolutional layer; and n_i' denotes the number of channels retained in that layer after pruning.

The importance of the channels of each convolutional layer is approximated from the change values of the von Neumann graph entropy in that layer according to the following formula:

CI(b_k) = (1/T) · Σ_{j=1}^{T} ΔH_{k,j}    (14)

where ΔH_{k,j} denotes the von Neumann graph entropy change value obtained for the kth channel b_k of the layer when the jth image is input, and T denotes the total number of input images.
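The averaging in formula (14) reduces to a mean over the per-image change values (sketch; the array of change values is made up):

```python
import numpy as np

# Hypothetical change values: rows = input images, columns = channels;
# delta_H[j, k] is the entropy change for channel k on image j.
delta_H = np.array([[0.10, 0.40, 0.35, 0.05],
                    [0.12, 0.38, 0.33, 0.07]])

# CI(b_k) = (1/T) * sum_j delta_H[j, k], with T input images.
channel_importance = delta_H.mean(axis=0)   # -> [0.11, 0.39, 0.34, 0.06]
```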

From formulas (13) and (14), maximizing the performance of the pruned model is equivalent to retaining the channels of greatest importance. Therefore, pruning the channels of each convolutional layer according to its pruning rate means sorting the channels of each layer by importance and removing them from smallest to largest until that layer's pruning rate is reached.

Taking the ResNet56 neural network model as an example: it contains 55 convolutional layers, so all channels of each of these 55 layers are sorted by importance from smallest to largest.

Fig. 2 takes a single convolutional layer as an example to show how the pruned lightweight model is obtained through channel pruning in this embodiment. In Fig. 2, the convolutional layer has 4 channels. After one image is input into the layer, a weighted undirected graph with 4 vertices is constructed, and the von Neumann graph entropy computed from it serves as the original value of the layer. Single vertices are then deleted in turn, constructing 4 new weighted undirected graphs of 3 vertices each; a new von Neumann graph entropy is obtained for each and compared with the original value, giving the entropy change values of the 4 channels. Since only one image is used, the change value directly plays the role of the importance. With a pruning rate of 0.5, the channels are sorted from smallest to largest change value: channel 4 and channel 1, whose change values are relatively small, are pruned, and channel 2 and channel 3 are retained.
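The selection step of this walkthrough (4 channels, pruning rate 0.5, keep the two largest change values) can be reproduced as a sketch; the importance numbers are illustrative, not the patent's:

```python
import numpy as np

channel_importance = np.array([0.30, 0.55, 0.50, 0.10])  # channels 1..4
pruning_rate = 0.5

n = len(channel_importance)
n_prune = int(round(pruning_rate * n))
order = np.argsort(channel_importance)            # ascending importance
pruned = sorted((order[:n_prune] + 1).tolist())   # 1-based channel indices
kept = sorted((order[n_prune:] + 1).tolist())
# pruned == [1, 4], kept == [2, 3]
```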

It should be noted that, for a neural network model, channel importance is comparable among the feature map channels output by the same convolutional layer, while feature map channels of different layers affect the model differently; therefore the channel pruning rate, the channel importance ranking, and the pruning operation are all carried out layer by layer over the model's convolutional layers. The channel pruning rate can be set according to actual needs, balancing model performance against model computation. Experiments show that the shallow convolutional channels of a model tend to be more important than the deep ones, so the pruning rate of shallow convolutional layers is generally set smaller than that of deep layers. In addition, for residual networks the pruning rate of the residual output channels must match the pruning rate of the backbone output channels, so that the two output feature maps can still be added together.

In this embodiment, fine-tuning the pruned neural network model includes: setting the fine-tuning hyperparameters, including the initial learning rate, a cosine annealing decay schedule with warm-up, the batch size, the number of iterations, stochastic gradient descent with momentum and weight decay, and the layer-wise pruning rates; the pruned lightweight model is then fine-tuned on a GPU, yielding the trained pruned lightweight model.
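One common form of the "cosine annealing decay with warm-up" schedule mentioned above is sketched below in pure Python; the specific hyperparameter values are assumptions for illustration, since the patent does not list them:

```python
import math

def lr_at_epoch(epoch, total_epochs=100, warmup_epochs=5,
                base_lr=0.1, min_lr=0.0):
    # Linear warm-up to base_lr, then cosine decay down to min_lr.
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    t = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))

schedule = [lr_at_epoch(e) for e in range(100)]
```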

S104: Deploy the pruned lightweight model to the package recognition terminal device and recognize the package information in images collected in real time.

Specifically, based on the images collected in real time, the number and locations of the packages in each image are recognized, and the package recognition metrics are reported, including model size, recognition speed, and recognition accuracy.

For example, on the CIFAR-10 dataset, pruning the ResNet56 neural network model reduces the parameter count by 42.8% and the floating-point operations by 47.4%, while the accuracy of the pruned lightweight model improves by 1.02%; on the ImageNet dataset, pruning the ResNet50 neural network model reduces the parameter count by 40.8% and the floating-point operations by 44.8%, while the accuracy of the pruned lightweight model improves by 0.25%. Compared with the unpruned neural network model, the method of this embodiment achieves significant model compression and acceleration, and at the same or even higher pruning rates it achieves better model accuracy than other comparable methods.

Compared with the prior art, the package recognition method based on a pruned lightweight model provided by this embodiment applies theories such as the undirected graph and the von Neumann graph entropy to neural network pruning. The feature map matrix of the neural network model is converted into a weighted undirected graph; considering both the magnitude and the correlation of the feature maps, the change in von Neumann graph entropy after deleting a single channel is computed and used as the pruning criterion, which preserves the performance of the pruned lightweight model and achieves significant model compression and acceleration. The von Neumann graph entropy of the feature map matrix is highly stable and only weakly correlated with the number of input training images, so accurate feature map channel importance is obtained with only a few training images, giving the scheme high pruning efficiency, low memory usage, and stable pruning results. The channel pruning method of this embodiment applies to neural network models composed of convolutional layers with the CONV-BN-ReLU structure; while keeping performance as high as possible, it reduces the parameter count and computation as far as possible, making the neural network model lightweight, so that the compressed lightweight deep neural network model can be deployed on resource-constrained edge devices, broadening its range of use and maximizing its application value.

Embodiment 2

Another embodiment of the present invention discloses a package recognition system based on a pruned lightweight model, which implements the package recognition method of Embodiment 1. For the specific implementation of each module, refer to the corresponding description in Embodiment 1. As shown in Fig. 3, the system includes:

A feature map matrix extraction module S201, configured to input the training images into the pre-trained neural network model to be pruned and extract the feature map matrix of each convolutional layer;

A von Neumann graph entropy calculation module S202, configured to convert the feature map matrix of each convolutional layer into a weighted undirected graph, construct the improved Laplacian matrix from the magnitude matrix of the feature map matrix, and compute the von Neumann graph entropy as the original value of each convolutional layer; and to delete single vertices in turn from the weighted undirected graph of each layer to obtain new weighted undirected graphs and compute the change of each new graph's von Neumann graph entropy relative to the original value;

A channel pruning module S203, configured to calculate the importance of the channels of each convolutional layer from the change values of the von Neumann graph entropy in that layer, prune the channels of each layer according to its pruning rate and the channel importance, and train the pruned neural network model to obtain the pruned lightweight model;

A package recognition module S204, configured to deploy the pruned lightweight model to the package recognition terminal device and recognize the package information in images collected in real time.

Since the parts of the package recognition system of this embodiment that relate to the foregoing recognition method can be cross-referenced, describing them here would be repetition, so they are not repeated. Any system employed by the methods in the embodiments of the present invention shall fall within the protection scope of the present invention. Since this system embodiment follows the same principles as the method embodiment above, it also has the corresponding technical effects of the method embodiment.

Those skilled in the art will understand that all or part of the processes of the above embodiment methods can be implemented by a computer program instructing the relevant hardware, the program being storable in a computer-readable storage medium, where the computer-readable storage medium is a magnetic disk, an optical disc, a read-only memory, a random-access memory, or the like.

The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto; any change or substitution that can readily be conceived by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention.

Claims (6)

1. A parcel identification method based on a pruning lightweight model is characterized by comprising the following steps:
inputting the training picture into a pre-trained neural network model to be pruned, and extracting a characteristic diagram matrix of each convolution layer; the training pictures are obtained by randomly extracting the training pictures from the wrapped picture data set;
converting the characteristic diagram matrix of each convolution layer into a weighted undirected graph, constructing an improved Laplace matrix according to the amplitude matrix of the characteristic diagram matrix, and calculating von Neumann diagram entropy as an original value of each convolution layer, wherein the method comprises the following steps:
deforming each three-dimensional characteristic diagram in the characteristic diagram matrix of each convolution layer to obtain a characteristic row vector matrix, wherein each row is a characteristic row vector corresponding to each channel and is respectively used as the vertex of a weighted undirected graph of each convolution layer;
calculating the cosine distance of any two characteristic row vectors in the characteristic row vector matrix of each convolution layer, and taking the cosine distance as the edge weight between two corresponding vertexes in the weighted undirected graph of each convolution layer;
acquiring an adjacency matrix and a degree matrix according to the weighted undirected graph;
obtaining a magnitude matrix according to the feature map matrix, wherein the magnitude matrix is a diagonal matrix, and each diagonal element is the sum of squares of all elements in corresponding channels in the feature map matrix;
multiplying the degree matrix by the amplitude matrix, and then subtracting the adjacent matrix to obtain an improved Laplace matrix;
the von Neumann graph entropy is calculated by the following formula:
H i = − Σ (λ k /trace(L d ))·log(λ k /trace(L d )), the sum being taken over k = 1, …, n i
wherein H i represents the von Neumann graph entropy of the ith convolutional layer, L d represents the improved Laplace matrix; trace(·) represents the trace of the matrix, i.e., the sum of all eigenvalues of the matrix; λ k represents the kth eigenvalue of the improved Laplace matrix L d , λ k is not less than 0, and the sum of all λ k over k = 1, …, n i equals trace(L d );
n i represents the number of channels of the ith convolutional layer;
deleting single vertexes in the weighted undirected graph of each convolution layer in sequence to obtain a new weighted undirected graph, and calculating the change value of von Neumann diagram entropy of each new weighted undirected graph relative to the original value, wherein the change value comprises the following steps:
sequentially deleting the characteristic row vectors corresponding to a single vertex aiming at the weighted undirected graph of each convolution layer, reconverting the characteristic row vector matrix obtained after deletion each time into a new weighted undirected graph, constructing a new improved Laplace matrix, and calculating a new von Neumann graph entropy;
calculating the absolute value of the difference value between each new von Neumann diagram entropy and the original value of the corresponding convolution layer to be used as the change value of the von Neumann diagram entropy of the channel corresponding to the deleted vertex;
calculating the importance of the channel of each convolutional layer according to the variation value of the von Neumann diagram entropy in each convolutional layer, pruning the channel of each convolutional layer according to the pruning rate of each convolutional layer and the importance of the channel, and training the pruned neural network model to obtain a pruning lightweight model;
and deploying the pruning lightweight model to a package recognition terminal device, and recognizing package information in the real-time collected pictures.
2. The package identification method based on the pruning lightweight model according to claim 1, wherein the adjacency matrix is a real symmetric matrix, the off-diagonal elements of the adjacency matrix are edge weights between two corresponding vertices, and the diagonal elements are 0; the degree matrix is a diagonal matrix with each row of diagonal elements being the sum of all elements of the corresponding row in the adjacency matrix.
3. The package recognition method based on the pruning lightweight model according to claim 1, wherein the calculating the importance of the channel of each convolution layer according to the variation value of the von Neumann map entropy in each convolution layer comprises:
and calculating the average value of the von Neumann diagram entropy change values of each channel according to the von Neumann diagram entropy change values of the channels of the convolutional layers obtained by each training picture, wherein the average value is used as the importance of each channel.
4. The package identification method based on the pruning lightweight model according to claim 3, wherein the pruning of the channels of each convolutional layer according to the pruning rate of each convolutional layer and the importance of the channels is performed from small to large according to the pruning rate of each convolutional layer and the importance of the channels of each convolutional layer.
5. The package identification method based on the pruning lightweight model according to claim 1 or 4, wherein the neural network model to be pruned is a neural network model composed of convolutional layers having a CONV-BN-ReLU structure.
6. A parcel recognition system based on a pruning lightweight model is characterized by comprising:
the characteristic diagram matrix extraction module is used for inputting the training picture into the pre-trained neural network model to be pruned and extracting the characteristic diagram matrix of each convolution layer; the training pictures are obtained by randomly extracting the training pictures from the package picture data set;
the von Neumann map entropy calculation module is used for converting the characteristic map matrix of each convolution layer into a weighted undirected graph, constructing an improved Laplace matrix according to the amplitude matrix of the characteristic map matrix, and calculating the von Neumann map entropy as an original value of each convolution layer, and comprises the following steps:
deforming each three-dimensional characteristic diagram in the characteristic diagram matrix of each convolution layer to obtain a characteristic row vector matrix, wherein each row is a characteristic row vector corresponding to each channel and is respectively used as the vertex of a weighted undirected graph of each convolution layer;
calculating the cosine distance of any two characteristic row vectors in the characteristic row vector matrix of each convolution layer, and taking the cosine distance as the edge weight between two corresponding vertexes in the weighted undirected graph of each convolution layer;
acquiring an adjacency matrix and a degree matrix according to the weighted undirected graph;
obtaining a magnitude matrix according to the feature map matrix, wherein the magnitude matrix is a diagonal matrix, and each diagonal element is the sum of squares of all elements in a corresponding channel in the feature map matrix;
multiplying the degree matrix by the amplitude matrix, and then subtracting the adjacent matrix to obtain an improved Laplace matrix;
the von Neumann graph entropy is calculated by the following formula:
H i = − Σ (λ k /trace(L d ))·log(λ k /trace(L d )), the sum being taken over k = 1, …, n i
wherein H i represents the von Neumann graph entropy of the ith convolutional layer, L d represents the improved Laplace matrix; trace(·) represents the trace of the matrix, i.e., the sum of all eigenvalues of the matrix; λ k represents the kth eigenvalue of the improved Laplace matrix L d , λ k is not less than 0, and the sum of all λ k over k = 1, …, n i equals trace(L d );
n i represents the number of channels of the ith convolutional layer;
sequentially deleting single vertexes in the weighted undirected graph of each convolution layer to obtain a new weighted undirected graph, and calculating the variation value of the von Neumann graph entropy of each new weighted undirected graph relative to the original value, wherein the variation value comprises the following steps:
sequentially deleting the characteristic row vectors corresponding to a single vertex aiming at the weighted undirected graph of each convolution layer, reconverting the characteristic row vector matrix obtained after deletion each time into a new weighted undirected graph, constructing a new improved Laplace matrix, and calculating a new Von Neumann graph entropy;
calculating the absolute value of the difference value between each new von Neumann diagram entropy and the original value of the corresponding convolution layer, and taking the absolute value as the variation value of the von Neumann diagram entropy of the channel corresponding to the deleted vertex;
the channel pruning module is used for calculating the importance of the channel of each convolutional layer according to the variation value of the von Neumann diagram entropy in each convolutional layer, pruning the channel of each convolutional layer according to the pruning rate of each convolutional layer and the importance of the channel, training the neural network model after pruning and obtaining a pruning lightweight model;
and the package identification module is used for deploying the pruning lightweight model to package identification terminal equipment and identifying package information in the real-time acquired pictures.
CN202210447343.7A 2022-04-27 2022-04-27 A package identification method and system based on pruning lightweight model Active CN114548884B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210447343.7A CN114548884B (en) 2022-04-27 2022-04-27 A package identification method and system based on pruning lightweight model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210447343.7A CN114548884B (en) 2022-04-27 2022-04-27 A package identification method and system based on pruning lightweight model

Publications (2)

Publication Number Publication Date
CN114548884A CN114548884A (en) 2022-05-27
CN114548884B true CN114548884B (en) 2022-07-12

Family

ID=81667083

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210447343.7A Active CN114548884B (en) 2022-04-27 2022-04-27 A package identification method and system based on pruning lightweight model

Country Status (1)

Country Link
CN (1) CN114548884B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105139430A (en) * 2015-08-27 2015-12-09 哈尔滨工程大学 Medical image clustering method based on entropy
WO2016083620A1 (en) * 2014-11-28 2016-06-02 Imec Taiwan Co. Autofocus system and method in digital holography
CN109886397A (en) * 2019-03-21 2019-06-14 西安交通大学 A neural network structured pruning and compression optimization method for convolutional layers
CN112508225A (en) * 2020-10-27 2021-03-16 宁波工程学院 Multi-detail traffic cell partitioning method and system based on spectral clustering algorithm
CN113807466A (en) * 2021-10-09 2021-12-17 中山大学 Logistics package autonomous detection method based on deep learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210303536A1 (en) * 2020-03-19 2021-09-30 NEC Laboratories Europe GmbH Methods and systems for graph approximation


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Exploring the heterogeneity for node importance by von Neumann; Feng X, et al.; Physica A: Statistical Mechanics and its Applications; 2019-12-31; full text *
Extremal properties of two classes of graph entropies and their application in complex networks; Xue Yulong (薛玉龙); China Masters' Theses Full-text Database, Basic Sciences; 2020-12-15 (No. 12); full text *
Research on structural characteristics and robustness of complex networks; Dong Men (董璊); China Masters' Theses Full-text Database, Basic Sciences; 2019-09-15 (No. 9); full text *

Also Published As

Publication number Publication date
CN114548884A (en) 2022-05-27

Similar Documents

Publication Publication Date Title
CN113128355B (en) Unmanned aerial vehicle image real-time target detection method based on channel pruning
CN116580257B (en) Feature fusion model training and sample retrieval method, device and computer equipment
CN113705580B (en) Hyperspectral image classification method based on deep migration learning
CN111507521B (en) Electric power load forecasting method and forecasting device in Taiwan area
CN109299716A (en) Training method, image partition method, device, equipment and the medium of neural network
CN112052893A (en) Generative Adversarial Network-Based Semi-Supervised Image Classification Method
US20240143977A1 (en) Model training method and apparatus
CN108960059A (en) A kind of video actions recognition methods and device
CN109784283A (en) Remote sensing image target extraction method based on scene recognition task
CN112668630B (en) A lightweight image classification method, system and device based on model pruning
CN109711422A (en) Image data processing, model establishment method, device, computer equipment and storage medium
CN110135227B (en) Laser point cloud outdoor scene automatic segmentation method based on machine learning
CN113313119B (en) Image recognition method, device, equipment, medium and product
CN110032925A (en) A kind of images of gestures segmentation and recognition methods based on improvement capsule network and algorithm
CN114913379B (en) Remote sensing image small sample scene classification method based on multitasking dynamic contrast learning
CN114419570A (en) Point cloud data identification method and device, electronic equipment and storage medium
CN110647990A (en) A tailoring method of deep convolutional neural network model based on grey relational analysis
CN110008853A (en) Pedestrian detection network and model training method, detection method, medium, equipment
CN114359167A (en) A lightweight YOLOv4-based insulator defect detection method in complex scenarios
Xu et al. Attention-based contrastive learning for few-shot remote sensing image classification
CN114998757A (en) Target detection method for UAV aerial image analysis
CN110457677A (en) Entity-relationship recognition method and device, storage medium, computer equipment
CN113269224A (en) Scene image classification method, system and storage medium
CN113327227B (en) A fast detection method of wheat head based on MobilenetV3
CN113807363A (en) Image classification method based on lightweight residual error network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant