WO2022126902A1 - Model compression method and apparatus, electronic device, and medium

Info

Publication number
WO2022126902A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
compressed
model
preset
loss value
Prior art date
Application number
PCT/CN2021/083080
Other languages
French (fr)
Chinese (zh)
Inventor
成冠举
李葛
曾婵
高鹏
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2022126902A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T90/00 Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation

Definitions

  • the long short-term memory network can train the mapping of the random noise from Gaussian distribution to fitting distribution, and at the same time, in order to prevent the occurrence of over-fitting, a dropout mechanism is added to each layer of neural network of the long short-term memory network.
  • the activation function may be a tanh function, and the tanh function is used to compress the data in the fitting data set between -1 and 1, so that the vectorization operation can be performed subsequently.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A model compression method and apparatus, an electronic device and a storage medium, which relate to data processing technology. The method comprises: performing data fitting on random noise data by using a pre-constructed fitter to obtain simulation data; calculating activation loss values between the simulation data and the noise data, adjusting parameters of the fitter when the activation loss values are greater than a preset activation threshold until the activation loss values are less than or equal to the preset activation threshold; inputting the simulation data into a model to be compressed to obtain output data; and calculating sparse loss values between the output data and the simulation data, and adjusting internal parameters of the fitter when the sparse loss values are greater than a preset sparse threshold until the sparse loss values are less than or equal to the preset sparse threshold; and outputting the simulation data and compressing the model to obtain a compressed model. The method, apparatus, electronic device and storage medium can achieve model compression without acquiring training data, network structures and parameters.

Description

Model compression method, apparatus, electronic device, and medium
This application claims priority to Chinese patent application No. 202011501677.5, entitled "Model Compression Method, Apparatus, Electronic Device and Medium", filed with the China Patent Office on December 18, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of data processing, and in particular to a model compression method, apparatus, electronic device, and computer-readable storage medium.
Background
In the era of big data, deep learning models are used more and more frequently. To run them on small devices such as mobile devices and sensors, the models sometimes have to be compressed and pruned before they can be deployed.
The inventors realized that current mainstream deep learning compression methods compress a model based on the original training data set, network structure, parameters, and so on. Examples are knowledge distillation methods and metadata-based methods: the former require a large amount of original training data, while the latter require the model's network structure and parameters. For legal, privacy, and other reasons, however, training data, network structures, and parameters are usually hard to obtain.
Summary of the Invention
A model compression method provided by the present application comprises:
performing a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data;
calculating an activation loss value between the simulation data and the noise data by using a preset first loss function; when the activation loss value is greater than a preset activation threshold, adjusting the parameters of the fitter and returning to the step of performing the data fitting operation on the random noise data by using the pre-built fitter to obtain simulation data; and, once the activation loss value is less than or equal to the preset activation threshold, inputting the simulation data into a model to be compressed to obtain output data;
calculating a sparse loss value between the output data and the simulation data by using a preset second loss function; when the sparse loss value is greater than a preset sparse threshold, adjusting the internal parameters of the fitter and returning to the step of performing the data fitting operation on the random noise data by using the pre-built fitter to obtain simulation data; and, once the sparse loss value is less than or equal to the preset sparse threshold, outputting the simulation data;
compressing the model to be compressed according to the simulation data to obtain a compressed model.
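The control flow of the steps above can be sketched as a single loop. This is a minimal illustration, not the applicant's implementation: the function name `generate_simulation_data`, the `ToyFitter` class, the `max_rounds` safety cap, and the toy loss functions are all assumptions introduced here, and the fitter, the model, and both loss functions are passed in as plain callables.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def generate_simulation_data(
    fit: Callable[[Sequence[Vector]], List[Vector]],
    adjust: Callable[[float], None],
    model: Callable[[Vector], Vector],
    activation_loss: Callable[[Sequence[Vector], Sequence[Vector]], float],
    sparse_loss: Callable[[Sequence[Vector], Sequence[Vector]], float],
    noise: Sequence[Vector],
    activation_threshold: float,
    sparse_threshold: float,
    max_rounds: int = 100,  # safety cap; an assumption, not in the source
) -> List[Vector]:
    """Repeat the fitting step until both loss values meet their thresholds."""
    for _ in range(max_rounds):
        simulated = fit(noise)                      # fitting step
        a_loss = activation_loss(simulated, noise)  # activation loss check
        if a_loss > activation_threshold:
            adjust(a_loss)                          # adjust fitter, fit again
            continue
        outputs = [model(sample) for sample in simulated]
        s_loss = sparse_loss(outputs, simulated)    # sparse loss check
        if s_loss > sparse_threshold:
            adjust(s_loss)                          # adjust fitter, fit again
            continue
        return simulated                            # both thresholds met
    raise RuntimeError("loss thresholds not reached within max_rounds")


class ToyFitter:
    """Hypothetical one-parameter fitter used only to exercise the loop."""

    def __init__(self) -> None:
        self.scale = 0.1

    def fit(self, noise: Sequence[Vector]) -> List[Vector]:
        return [[math.tanh(self.scale * v) for v in row] for row in noise]

    def adjust(self, loss: float) -> None:
        self.scale *= 2.0  # crude stand-in for a real parameter update


def neg_mean_l1(simulated: Sequence[Vector], noise: Sequence[Vector]) -> float:
    """Toy activation loss: negative mean L1 norm of the simulated samples."""
    return -sum(sum(abs(v) for v in row) for row in simulated) / len(simulated)


fitter = ToyFitter()
result = generate_simulation_data(
    fit=fitter.fit,
    adjust=fitter.adjust,
    model=lambda sample: list(sample),           # identity stand-in for the model
    activation_loss=neg_mean_l1,
    sparse_loss=lambda outputs, simulated: 0.0,  # always passes in this toy run
    noise=[[1.0, -1.0]],
    activation_threshold=-0.5,
    sparse_threshold=0.1,
)
```

In this toy run the fitter is adjusted twice before the activation loss drops to the threshold; a real fitter would update its weights by gradient descent rather than by doubling a scale factor.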
The present application also provides a model compression apparatus, the apparatus comprising:
a data fitting module, configured to perform a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data;
an activation loss module, configured to calculate an activation loss value between the simulation data and the noise data by using a preset first loss function, to adjust the parameters of the fitter when the activation loss value is greater than a preset activation threshold, and, once the activation loss value is less than or equal to the preset activation threshold, to input the simulation data into a model to be compressed to obtain output data;
a sparse loss module, configured to calculate a sparse loss value between the output data and the simulation data by using a preset second loss function, to adjust the internal parameters of the fitter when the sparse loss value is greater than a preset sparse threshold, and, once the sparse loss value is less than or equal to the preset sparse threshold, to output the simulation data;
a model compression module, configured to compress the model to be compressed according to the simulation data to obtain a compressed model.
The present application also provides an electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the following model compression method:
performing a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data;
calculating an activation loss value between the simulation data and the noise data by using a preset first loss function; when the activation loss value is greater than a preset activation threshold, adjusting the parameters of the fitter and returning to the step of performing the data fitting operation on the random noise data by using the pre-built fitter to obtain simulation data; and, once the activation loss value is less than or equal to the preset activation threshold, inputting the simulation data into a model to be compressed to obtain output data;
calculating a sparse loss value between the output data and the simulation data by using a preset second loss function; when the sparse loss value is greater than a preset sparse threshold, adjusting the internal parameters of the fitter and returning to the step of performing the data fitting operation on the random noise data by using the pre-built fitter to obtain simulation data; and, once the sparse loss value is less than or equal to the preset sparse threshold, outputting the simulation data;
compressing the model to be compressed according to the simulation data to obtain a compressed model.
The present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the following model compression method:
performing a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data;
calculating an activation loss value between the simulation data and the noise data by using a preset first loss function; when the activation loss value is greater than a preset activation threshold, adjusting the parameters of the fitter and returning to the step of performing the data fitting operation on the random noise data by using the pre-built fitter to obtain simulation data; and, once the activation loss value is less than or equal to the preset activation threshold, inputting the simulation data into a model to be compressed to obtain output data;
calculating a sparse loss value between the output data and the simulation data by using a preset second loss function; when the sparse loss value is greater than a preset sparse threshold, adjusting the internal parameters of the fitter and returning to the step of performing the data fitting operation on the random noise data by using the pre-built fitter to obtain simulation data; and, once the sparse loss value is less than or equal to the preset sparse threshold, outputting the simulation data;
compressing the model to be compressed according to the simulation data to obtain a compressed model.
Description of the Drawings
FIG. 1 is a schematic flowchart of a model compression method provided by an embodiment of the present application;
FIG. 2 is a schematic block diagram of a model compression apparatus provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of the internal structure of an electronic device implementing the model compression method provided by an embodiment of the present application.
The realization of the objectives, the functional characteristics, and the advantages of the present application are further described below with reference to the accompanying drawings and in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are only used to explain the present application and are not intended to limit it.
An embodiment of the present application provides a model compression method. The method may be executed by, without limitation, at least one electronic device that can be configured to execute the method provided by the embodiments of the present application, such as a server or a terminal. In other words, the model compression method can be executed by software or hardware installed on a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to a single server, a server cluster, a cloud server, or a cloud server cluster.
FIG. 1 is a schematic flowchart of a model compression method provided by an embodiment of the present application. In this embodiment, the model compression method includes:
S1. Perform a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data.
In this embodiment of the present application, the random noise data is random Gaussian noise sampled from a Gaussian distribution. The fitter repeatedly applies linear fitting to the noise data to generate simulation data that approximates real data.
Specifically, performing the data fitting operation on the random noise data by using the pre-built fitter to obtain simulation data includes:
predicting on the noise data by using the long short-term memory network in the fitter to obtain a fitted data set;
compressing the fitted data set by using an activation function to obtain a compressed data set;
vectorizing the compressed data set to obtain the simulation data.
The long short-term memory network can learn the mapping of the random noise from the Gaussian distribution to the fitted distribution. To prevent over-fitting, a dropout mechanism is added to each neural network layer of the long short-term memory network. The activation function may be the tanh function, which compresses the data in the fitted data set into the range between -1 and 1 so that the vectorization operation can be performed subsequently.
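The two mechanisms named above, dropout and tanh compression, can be illustrated in isolation. This is a minimal sketch under stated assumptions: the long short-term memory network itself is omitted, the function names `dropout` and `squash` are hypothetical, and the dropout shown is the common "inverted" variant, which the source does not specify.

```python
import math
import random
from typing import List, Sequence


def dropout(values: Sequence[float], rate: float, rng: random.Random) -> List[float]:
    """Inverted dropout: zero each value with probability `rate`, rescale the rest.

    Assumption: the source only says a dropout mechanism is added to each
    layer; this particular variant is chosen for illustration.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


def squash(fitted: Sequence[Sequence[float]]) -> List[List[float]]:
    """Apply tanh element-wise so every value lies between -1 and 1."""
    return [[math.tanh(v) for v in row] for row in fitted]


rng = random.Random(0)
noise = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]  # Gaussian noise
compressed = squash([dropout(row, 0.2, rng) for row in noise])
```

Note that in floating point, tanh of a very large input rounds to exactly 1.0 or -1.0, so the range is closed in practice.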
Further, vectorizing the compressed data set to obtain the simulation data includes:
mapping the compressed data in the compressed data set to feature vectors by using the Word2Vec algorithm;
concatenating the feature vectors in sequence order to obtain the simulation data.
The Word2Vec algorithm maps data to vectors of a uniform dimension. It is suited to sequence data in which neighboring elements are strongly correlated, and it can be used to analyze the data in a more generalized way.
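The map-then-concatenate step can be sketched as follows. This sketch does not train or call a real Word2Vec model: the embedding table is filled with random numbers as a stand-in for trained embeddings, and quantizing each compressed value into a bin to obtain a token is purely an assumption made here, since the source does not say how continuous values become Word2Vec inputs. All names (`build_embedding_table`, `to_token`, `vectorize`) are hypothetical.

```python
import random
from typing import List, Sequence


def build_embedding_table(num_bins: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Random stand-in for a trained embedding table (one row per token)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_bins)]


def to_token(value: float, num_bins: int) -> int:
    """Quantize a value in [-1, 1] to a bin index in [0, num_bins - 1]."""
    clipped = max(-1.0, min(1.0, value))
    return min(int((clipped + 1.0) / 2.0 * num_bins), num_bins - 1)


def vectorize(sequence: Sequence[float], table: Sequence[Sequence[float]]) -> List[float]:
    """Look up one uniform-dimension vector per element and concatenate in order."""
    simulation: List[float] = []
    for value in sequence:
        simulation.extend(table[to_token(value, len(table))])
    return simulation


table = build_embedding_table(num_bins=4, dim=3)
simulation_sample = vectorize([-1.0, 0.0, 1.0], table)
```

Because every lookup returns a vector of the same dimension, a sequence of k compressed values always yields a simulation sample of length k times that dimension.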
In detail, performing the data fitting operation on the random noise data with the pre-built fitter yields simulation data close to the random noise data, which is used in place of the random noise data for the subsequent model compression.
S2. Calculate an activation loss value between the simulation data and the noise data by using a preset first loss function.
In this embodiment of the present application, the first loss function is:
Figure PCTCN2021083080-appb-000001
where the symbol shown in Figure PCTCN2021083080-appb-000002 is the activation loss value, n is the number of samples of the noise data, the symbol shown in Figure PCTCN2021083080-appb-000003 is the m-th data item in the simulation data, and || ||1 is the L1 norm. The L1 norm is normally used to obtain sparsity; the negative sign is added to discourage sparsity, so that as many elements as possible of the quantity shown in Figure PCTCN2021083080-appb-000004 are activated.
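Since the formula itself survives only as an image reference, the following is a sketch of one loss that matches the verbal definition: the negative of the L1 norms of the n simulated samples, averaged over n. The averaging by 1/n and the function name `activation_loss` are assumptions; the published formula may differ in scaling.

```python
from typing import Sequence


def activation_loss(simulated: Sequence[Sequence[float]]) -> float:
    """Negative mean L1 norm of the simulated samples.

    Larger absolute values (more "activated" elements) give a lower loss,
    which is the effect the negative sign is described as having.
    """
    n = len(simulated)
    if n == 0:
        raise ValueError("at least one sample is required")
    return -sum(sum(abs(v) for v in sample) for sample in simulated) / n


sparse_batch = [[0.0, 0.0], [0.1, 0.0]]
active_batch = [[0.5, -0.5], [1.0, 0.0]]
```

With these two toy batches, the batch with more non-zero, larger-magnitude entries receives the lower loss value.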
When the activation loss value is greater than the preset activation threshold, this embodiment adjusts the parameters of the fitter and returns to S1 above, again performing the data fitting operation on the random noise data with the pre-built fitter to obtain simulation data.
Preferably, the parameters of the fitter may be the weights, gradients, and the like of the fitter.
When the activation loss value is less than or equal to the preset activation threshold, S3 is performed: the simulation data is input into the model to be compressed to obtain output data.
The first loss function calculates the activation loss value between the simulation data and the noise data. The activation loss value is compared with the preset activation threshold and the parameters of the fitter are adjusted accordingly, until the activation loss value between the simulation data and the noise data converges. At that point the adjusted fitter meets the standard and its parameters need no further adjustment.
S4. Calculate a sparse loss value between the output data and the simulation data by using a preset second loss function.
In this embodiment of the present application, the second loss function may be:
Figure PCTCN2021083080-appb-000005
where the symbol shown in Figure PCTCN2021083080-appb-000006 is the sparse loss value, x is the number of samples of the simulation data, the symbol shown in Figure PCTCN2021083080-appb-000007 is the m-th data item in the output data, t_m is a preset parameter, and the function shown in Figure PCTCN2021083080-appb-000008 is the softmax loss function.
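The formula image is not available, so the following is a sketch of one loss consistent with the verbal definition: the softmax loss (softmax followed by cross-entropy) between each output and its preset parameter t_m, averaged over the x samples. Treating each output as a vector of logits, treating t_m as a class index, and averaging by 1/x are all assumptions, as are the function names.

```python
import math
from typing import Sequence


def softmax_cross_entropy(logits: Sequence[float], target_index: int) -> float:
    """Softmax loss for one sample, computed with the log-sum-exp trick."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target_index]


def sparse_loss(outputs: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Mean softmax loss over all samples (averaging is an assumption)."""
    if len(outputs) != len(targets) or not outputs:
        raise ValueError("outputs and targets must be non-empty and equal in length")
    total = sum(softmax_cross_entropy(y, t) for y, t in zip(outputs, targets))
    return total / len(outputs)


uniform_loss = sparse_loss([[0.0, 0.0], [0.0, 0.0]], [0, 1])
confident_loss = sparse_loss([[10.0, 0.0], [0.0, 10.0]], [0, 1])
```

Outputs that put almost all probability on the preset target give a loss near zero, while uniform outputs over two classes give a loss of ln 2.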
When the sparse loss value is greater than the preset sparse threshold, this embodiment adjusts the internal parameters of the fitter and returns to S1 above, again performing the data fitting operation on the random noise data with the pre-built fitter to obtain simulation data.
When the sparse loss value is less than or equal to the preset sparse threshold, S5 is performed: the simulation data is output, and the model to be compressed is compressed according to the simulation data to obtain a compressed model.
In this embodiment of the present application, compressing the model to be compressed according to the simulation data to obtain the compressed model includes:
inputting the simulation data into a preset standard compression model for vector operations to obtain a first feature output by the standard compression model, and inputting the simulation data into the model to be compressed for vector operations to obtain a second feature output by the model to be compressed;
determining a loss function of the model to be compressed according to the first feature and the second feature;
performing back-propagation on the model to be compressed according to the loss function to obtain the compressed model.
Specifically, determining the loss function of the model to be compressed according to the first feature and the second feature includes:
calculating the difference between the first feature and the second feature to obtain a difference function;
applying a norm to the difference function and squaring the result to obtain the loss function.
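The difference, norm, and square steps can be sketched as below. The source does not say which norm is used; the L2 norm is assumed here, which makes the loss the sum of squared element-wise differences. The function names are hypothetical, and the single update applied directly to the second feature only illustrates the direction of the gradient; in practice back-propagation would carry this gradient into the weights of the model to be compressed.

```python
from typing import List, Sequence


def feature_loss(first: Sequence[float], second: Sequence[float]) -> float:
    """Squared L2 norm of (first feature - second feature)."""
    if len(first) != len(second):
        raise ValueError("features must have the same length")
    return sum((a - b) ** 2 for a, b in zip(first, second))


def feature_loss_grad(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Gradient of feature_loss with respect to the second feature."""
    return [2.0 * (b - a) for a, b in zip(first, second)]


first_feature = [1.0, 2.0]   # output of the standard compression model
second_feature = [0.0, 0.0]  # output of the model to be compressed

loss_before = feature_loss(first_feature, second_feature)
grad = feature_loss_grad(first_feature, second_feature)
stepped = [b - 0.25 * g for b, g in zip(second_feature, grad)]  # one descent step
loss_after = feature_loss(first_feature, stepped)
```

One gradient step moves the second feature toward the first and lowers the loss, which is the behavior back-propagation relies on.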
FIG. 2 is a schematic block diagram of the model compression apparatus of the present application.
The model compression apparatus 100 described in the present application may be installed in an electronic device. According to the functions implemented, the model compression apparatus 100 may include a data fitting module 101, an activation loss module 102, a sparse loss module 103, and a model compression module 104. A module in the present application, which may also be called a unit, refers to a series of computer program segments that can be executed by the processor of the electronic device to perform a fixed function and that are stored in the memory of the electronic device.
In this embodiment, the functions of the modules/units are as follows:
the data fitting module 101 is configured to perform a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data;
the activation loss module 102 is configured to calculate an activation loss value between the simulation data and the noise data by using a preset first loss function, to adjust the parameters of the fitter when the activation loss value is greater than a preset activation threshold, and, once the activation loss value is less than or equal to the preset activation threshold, to input the simulation data into a model to be compressed to obtain output data;
the sparse loss module 103 is configured to calculate a sparse loss value between the output data and the simulation data by using a preset second loss function, to adjust the internal parameters of the fitter when the sparse loss value is greater than a preset sparse threshold, and, once the sparse loss value is less than or equal to the preset sparse threshold, to output the simulation data;
the model compression module 104 is configured to compress the model to be compressed according to the simulation data to obtain a compressed model.
In detail, when the modules in the model compression apparatus 100 are executed by the processor of the electronic device, a model compression method can be implemented. The specific implementation steps of the model compression method are as follows:
Step 1. The data fitting module 101 performs a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data.
In this embodiment of the present application, the random noise data is random Gaussian noise sampled from a Gaussian distribution. The fitter repeatedly applies linear fitting to the noise data to generate simulation data that approximates real data.
Specifically, the data fitting module 101 performing the data fitting operation on the random noise data by using the pre-built fitter to obtain simulation data includes:
predicting on the noise data by using the long short-term memory network in the fitter to obtain a fitted data set;
compressing the fitted data set by using an activation function to obtain a compressed data set;
vectorizing the compressed data set to obtain the simulation data.
The long short-term memory network can learn the mapping of the random noise from the Gaussian distribution to the fitted distribution. To prevent over-fitting, a dropout mechanism is added to each neural network layer of the long short-term memory network. The activation function may be the tanh function, which compresses the data in the fitted data set into the range between -1 and 1 so that the vectorization operation can be performed subsequently.
Further, vectorizing the compressed data set to obtain the simulation data includes:
mapping the compressed data in the compressed data set to feature vectors by using the Word2Vec algorithm;
concatenating the feature vectors in sequence order to obtain the simulation data.
Step 2. The activation loss module 102 calculates an activation loss value between the simulation data and the noise data by using a preset first loss function.
In this embodiment of the present application, the first loss function is:
Figure PCTCN2021083080-appb-000009
where the symbol shown in Figure PCTCN2021083080-appb-000010 is the activation loss value, n is the number of samples of the noise data, the symbol shown in Figure PCTCN2021083080-appb-000011 is the m-th data item in the simulation data, and || ||1 is the L1 norm. The L1 norm is normally used to obtain sparsity; the negative sign is added to discourage sparsity, so that as many elements as possible of the quantity shown in Figure PCTCN2021083080-appb-000012 are activated.
When the activation loss value is greater than the preset activation threshold, this embodiment adjusts the parameters of the fitter and returns to Step 1 above, again performing the data fitting operation on the random noise data with the pre-built fitter to obtain simulation data.
Preferably, the parameters of the fitter may be the weights, gradients, and the like of the fitter.
When the activation loss value is less than or equal to the preset activation threshold, Step 3 is performed: the simulation data is input into the model to be compressed to obtain output data.
Step 4. The sparse loss module 103 calculates a sparse loss value between the output data and the simulation data by using a preset second loss function.
In this embodiment of the present application, the second loss function may be:
Figure PCTCN2021083080-appb-000013
where the symbol shown in Figure PCTCN2021083080-appb-000014 is the sparse loss value, x is the number of samples of the simulation data, the symbol shown in Figure PCTCN2021083080-appb-000015 is the m-th data item in the output data, t_m is a preset parameter, and the function shown in Figure PCTCN2021083080-appb-000016 is the softmax loss function.
When the sparse loss value is greater than the preset sparse threshold, this embodiment adjusts the internal parameters of the fitter and returns to Step 1 above, again performing the data fitting operation on the random noise data with the pre-built fitter to obtain simulation data.
When the sparse loss value is less than or equal to the preset sparse threshold, Step 5 is performed: the simulation data is output, and the model to be compressed is compressed according to the simulation data to obtain a compressed model.
In this embodiment of the present application, compressing the model to be compressed according to the simulation data to obtain a compressed model includes:
inputting the simulation data into a preset standard compression model for vector operations to obtain a first feature output by the standard compression model, and inputting the simulation data into the model to be compressed for vector operations to obtain a second feature output by the model to be compressed;
determining a loss function of the model to be compressed according to the first feature and the second feature;
performing back-propagation on the model to be compressed according to the loss function to obtain the compressed model.
Specifically, determining the loss function of the model to be compressed according to the first feature and the second feature includes:
calculating the difference between the first feature and the second feature to obtain a difference function;
applying a norm to the difference function and squaring the result to obtain the loss function.
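As a small worked example of this loss, the sketch below takes the norm to be the L2 norm (the text does not say which norm is used, so this is an assumption) and applies one gradient-descent step to a toy model whose second feature is simply w_i * x_i. The toy model, the learning rate, and all names are hypothetical and stand in for back-propagation through a real network.

```python
def squared_l2_loss(first_feature, second_feature):
    """Squared L2 norm of the difference between the two features."""
    diff = [s - f for f, s in zip(first_feature, second_feature)]
    return sum(d * d for d in diff)

def train_step(weights, inputs, first_feature, lr=0.1):
    """One gradient-descent step for a toy model with second feature w_i * x_i."""
    second = [w * x for w, x in zip(weights, inputs)]
    # d/dw_i of (w_i * x_i - f_i)^2 is 2 * (w_i * x_i - f_i) * x_i
    grads = [2.0 * (s - f) * x for s, f, x in zip(second, first_feature, inputs)]
    return [w - lr * g for w, g in zip(weights, grads)]

inputs = [1.0, 2.0]          # hypothetical simulation data
first_feature = [0.5, 1.0]   # hypothetical output of the standard compression model
weights = [0.0, 0.0]         # toy parameters of the model to be compressed

before = squared_l2_loss(first_feature, [w * x for w, x in zip(weights, inputs)])
weights = train_step(weights, inputs, first_feature)
after = squared_l2_loss(first_feature, [w * x for w, x in zip(weights, inputs)])
```

With these values the loss drops from 1.25 to 0.20 after a single step, showing how minimizing the squared difference pulls the second feature toward the first.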
FIG. 3 is a schematic structural diagram of an electronic device that implements the model compression method of the present application.
The electronic device 1 may include a processor 10, a memory 11, and a bus, and may further include a computer program stored in the memory 11 and executable on the processor 10, such as a model compression program 12.
The memory 11 includes at least one type of readable storage medium, which may be volatile or non-volatile. Specifically, the readable storage medium includes a flash memory, a removable hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. In other embodiments, the memory 11 may be an external storage device of the electronic device 1, such as a plug-in removable hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 can be used not only to store application software installed on the electronic device 1 and various types of data, such as the code of the model compression program 12, but also to temporarily store data that has been output or is to be output.
In some embodiments, the processor 10 may be composed of integrated circuits, for example a single packaged integrated circuit or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 10 is the control unit of the electronic device. It connects the components of the entire electronic device through various interfaces and lines, and performs the functions of the electronic device 1 and processes data by running or executing programs or modules stored in the memory 11 (for example, the model compression program) and by calling data stored in the memory 11.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. The bus is configured to implement connection and communication between the memory 11 and the at least one processor 10, among others.
FIG. 3 shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in FIG. 3 does not limit the electronic device 1, which may include fewer or more components than shown, combine certain components, or arrange the components differently.
For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) that powers the components. Preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device. The power supply may also include one or more DC or AC power sources, a recharging device, a power failure detection circuit, a power converter or inverter, a power status indicator, and any other such components. The electronic device 1 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described in detail here.
Further, the electronic device 1 may include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), which is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further include a user interface. The user interface may be a display or an input unit (such as a keyboard), and may optionally also be a standard wired or wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used to display information processed in the electronic device 1 and to display a visual user interface.
It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.
The model compression program 12 stored in the memory 11 of the electronic device 1 is a combination of multiple instructions which, when run on the processor 10, can implement:
performing a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data;
calculating an activation loss value between the simulation data and the noise data by using a preset first loss function; when the activation loss value is greater than a preset activation threshold, adjusting the parameters of the fitter and returning to the data fitting operation on the random noise data with the pre-built fitter to obtain simulation data, until the activation loss value is less than or equal to the preset activation threshold, and then inputting the simulation data into the model to be compressed to obtain output data;
calculating a sparse loss value between the output data and the simulation data by using a preset second loss function; when the sparse loss value is greater than a preset sparse threshold, adjusting the internal parameters of the fitter and returning to the data fitting operation on the random noise data with the pre-built fitter to obtain simulation data, until the sparse loss value is less than or equal to the preset sparse threshold, and then outputting the simulation data;
compressing the model to be compressed according to the simulation data to obtain a compressed model.
Further, if the modules/units integrated in the electronic device 1 are implemented as software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, or a read-only memory (ROM).
The present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor of an electronic device, can implement:
performing a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data;
calculating an activation loss value between the simulation data and the noise data by using a preset first loss function; when the activation loss value is greater than a preset activation threshold, adjusting the parameters of the fitter and returning to the data fitting operation on the random noise data with the pre-built fitter to obtain simulation data, until the activation loss value is less than or equal to the preset activation threshold, and then inputting the simulation data into the model to be compressed to obtain output data;
calculating a sparse loss value between the output data and the simulation data by using a preset second loss function; when the sparse loss value is greater than a preset sparse threshold, adjusting the internal parameters of the fitter and returning to the data fitting operation on the random noise data with the pre-built fitter to obtain simulation data, until the sparse loss value is less than or equal to the preset sparse threshold, and then outputting the simulation data;
compressing the model to be compressed according to the simulation data to obtain a compressed model.
Further, the computer-usable storage medium may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function, and the like, and the data storage area may store data created according to the use of a blockchain node, and the like.
In the several embodiments provided in this application, it should be understood that the disclosed device, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only a division by logical function, and other divisions are possible in an actual implementation.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It will be apparent to those skilled in the art that the present application is not limited to the details of the exemplary embodiments above, and that the present application can be implemented in other specific forms without departing from its spirit or essential characteristics.
Accordingly, the embodiments are to be regarded in all respects as illustrative and not restrictive. The scope of the present application is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalents of the claims are therefore intended to be embraced by the present application. No reference sign in the claims should be construed as limiting the claim concerned.
Furthermore, the word "comprising" clearly does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or apparatuses recited in the system claims may also be implemented by a single unit or apparatus through software or hardware. Terms such as "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present application may be modified or equivalently replaced without departing from their spirit and scope.

Claims (20)

  1. A model compression method, wherein the method comprises:
    Step A: performing a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data;
    Step B: calculating an activation loss value between the simulation data and the noise data by using a preset first loss function; when the activation loss value is greater than a preset activation threshold, adjusting the parameters of the fitter and returning to step A, until the activation loss value is less than or equal to the preset activation threshold, and then inputting the simulation data into a model to be compressed to obtain output data;
    Step C: calculating a sparse loss value between the output data and the simulation data by using a preset second loss function; when the sparse loss value is greater than a preset sparse threshold, adjusting the internal parameters of the fitter and returning to step A, until the sparse loss value is less than or equal to the preset sparse threshold, and then outputting the simulation data;
    Step D: compressing the model to be compressed according to the simulation data to obtain a compressed model.
  2. The model compression method according to claim 1, wherein performing a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data comprises:
    predicting on the noise data by using a long short-term memory network in the fitter to obtain a fitted data set;
    compressing the fitted data set by using an activation function to obtain a compressed data set;
    vectorizing the compressed data set to obtain the simulation data.
  3. The model compression method according to claim 2, wherein vectorizing the compressed data set to obtain the simulation data comprises:
    mapping the compressed data in the compressed data set to feature vectors by using the Word2Vec algorithm;
    concatenating the feature vectors in their sequence order to obtain the simulation data.
  4. The model compression method according to claim 1, wherein calculating an activation loss value between the simulation data and the noise data by using a preset first loss function comprises:
    calculating the activation loss value between the simulation data and the noise data by using the following first loss function:
    Figure PCTCN2021083080-appb-100001
    where (Figure PCTCN2021083080-appb-100002) is the activation loss value, n is the number of samples of the noise data, (Figure PCTCN2021083080-appb-100003) is the m-th item of the simulation data, and ||·||_1 denotes the L1 norm.
  5. The model compression method according to claim 1, wherein calculating a sparse loss value between the output data and the simulation data by using a preset second loss function comprises:
    calculating the sparse loss value between the output data and the simulation data by using the following second loss function:
    Figure PCTCN2021083080-appb-100004
    where (Figure PCTCN2021083080-appb-100005) is the sparse loss value, x is the number of samples of the simulation data, (Figure PCTCN2021083080-appb-100006) is the m-th item of the output data, t_m is a preset parameter, and (Figure PCTCN2021083080-appb-100007) is the softmax loss function.
  6. The model compression method according to any one of claims 1 to 5, wherein compressing the model to be compressed according to the simulation data to obtain a compressed model comprises:
    inputting the simulation data into a preset standard compression model for vector operations to obtain a first feature output by the standard compression model, and inputting the simulation data into the model to be compressed for vector operations to obtain a second feature output by the model to be compressed;
    determining a loss function of the model to be compressed according to the first feature and the second feature;
    performing back-propagation on the model to be compressed according to the loss function to obtain the compressed model.
  7. The model compression method according to claim 6, wherein determining the loss function of the model to be compressed according to the first feature and the second feature comprises:
    calculating the difference between the first feature and the second feature to obtain a difference function;
    applying a norm to the difference function and squaring the result to obtain the loss function.
  8. A model compression apparatus, wherein the apparatus comprises:
    a data fitting module, configured to perform a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data;
    an activation loss module, configured to calculate an activation loss value between the simulation data and the noise data by using a preset first loss function, to adjust the parameters of the fitter when the activation loss value is greater than a preset activation threshold, and, once the activation loss value is less than or equal to the preset activation threshold, to input the simulation data into a model to be compressed to obtain output data;
    a sparse loss module, configured to calculate a sparse loss value between the output data and the simulation data by using a preset second loss function, to adjust the internal parameters of the fitter when the sparse loss value is greater than a preset sparse threshold, and, once the sparse loss value is less than or equal to the preset sparse threshold, to output the simulation data;
    a model compression module, configured to compress the model to be compressed according to the simulation data to obtain a compressed model.
  9. An electronic device, wherein the electronic device comprises:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the following model compression method:
    Step A: performing a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data;
    Step B: calculating an activation loss value between the simulation data and the noise data by using a preset first loss function; when the activation loss value is greater than a preset activation threshold, adjusting the parameters of the fitter and returning to step A, until the activation loss value is less than or equal to the preset activation threshold, and then inputting the simulation data into a model to be compressed to obtain output data;
    Step C: calculating a sparse loss value between the output data and the simulation data by using a preset second loss function; when the sparse loss value is greater than a preset sparse threshold, adjusting the internal parameters of the fitter and returning to step A, until the sparse loss value is less than or equal to the preset sparse threshold, and then outputting the simulation data;
    Step D: compressing the model to be compressed according to the simulation data to obtain a compressed model.
  10. The electronic device according to claim 9, wherein performing a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data comprises:
    predicting on the noise data by using a long short-term memory network in the fitter to obtain a fitted data set;
    compressing the fitted data set by using an activation function to obtain a compressed data set;
    vectorizing the compressed data set to obtain the simulation data.
  11. The electronic device according to claim 10, wherein vectorizing the compressed data set to obtain the simulation data comprises:
    mapping the compressed data in the compressed data set to feature vectors by using the Word2Vec algorithm;
    concatenating the feature vectors in their sequence order to obtain the simulation data.
  12. The electronic device according to claim 9, wherein calculating an activation loss value between the simulation data and the noise data by using a preset first loss function comprises:
    calculating the activation loss value between the simulation data and the noise data by using the following first loss function:
    Figure PCTCN2021083080-appb-100008
    where (Figure PCTCN2021083080-appb-100009) is the activation loss value, n is the number of samples of the noise data, (Figure PCTCN2021083080-appb-100010) is the m-th item of the simulation data, and ||·||_1 denotes the L1 norm.
  13. The electronic device according to claim 9, wherein calculating a sparse loss value between the output data and the simulation data by using a preset second loss function comprises:
    calculating the sparse loss value between the output data and the simulation data by using the following second loss function:
    Figure PCTCN2021083080-appb-100011
    where (Figure PCTCN2021083080-appb-100012) is the sparse loss value, x is the number of samples of the simulation data, (Figure PCTCN2021083080-appb-100013) is the m-th item of the output data, t_m is a preset parameter, and (Figure PCTCN2021083080-appb-100014) is the softmax loss function.
  14. The electronic device according to any one of claims 9 to 13, wherein compressing the model to be compressed according to the simulation data to obtain a compressed model comprises:
    inputting the simulation data into a preset standard compression model for vector operations to obtain a first feature output by the standard compression model, and inputting the simulation data into the model to be compressed for vector operations to obtain a second feature output by the model to be compressed;
    determining a loss function of the model to be compressed according to the first feature and the second feature;
    performing back-propagation on the model to be compressed according to the loss function to obtain the compressed model.
  15. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the following model compression method:
    Step A: performing a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data;
    Step B: calculating an activation loss value between the simulation data and the noise data by using a preset first loss function; when the activation loss value is greater than a preset activation threshold, adjusting the parameters of the fitter and returning to step A, until the activation loss value is less than or equal to the preset activation threshold, and then inputting the simulation data into a model to be compressed to obtain output data;
    Step C: calculating a sparse loss value between the output data and the simulation data by using a preset second loss function; when the sparse loss value is greater than a preset sparse threshold, adjusting the internal parameters of the fitter and returning to step A, until the sparse loss value is less than or equal to the preset sparse threshold, and then outputting the simulation data;
    Step D: compressing the model to be compressed according to the simulation data to obtain a compressed model.
  16. The computer-readable storage medium according to claim 15, wherein performing a data fitting operation on random noise data by using a pre-built fitter to obtain simulation data comprises:
    predicting on the noise data by using a long short-term memory network in the fitter to obtain a fitted data set;
    compressing the fitted data set by using an activation function to obtain a compressed data set;
    vectorizing the compressed data set to obtain the simulation data.
  17. The computer-readable storage medium according to claim 16, wherein performing the vectorization processing on the compressed data set to obtain the simulation data comprises:
    mapping the compressed data in the compressed data set to feature vectors by using the Word2Vec algorithm;
    concatenating the feature vectors in sequence order to obtain the simulation data.
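The mapping and concatenation steps can be sketched as follows. A plain lookup table stands in for a trained Word2Vec model, which is an assumption made to keep the example self-contained. The name `vectorize`, the keys, and the vector values are hypothetical.

```python
def vectorize(compressed_data, embedding):
    """Map each compressed item to its feature vector and concatenate in order."""
    simulated = []
    for item in compressed_data:
        simulated.extend(embedding[item])   # append this item's feature vector
    return simulated


# Hypothetical embedding table standing in for a trained Word2Vec model.
embedding = {"a": [0.1, 0.2], "b": [0.3, 0.4]}
simulated = vectorize(["b", "a", "b"], embedding)
```

The concatenation preserves the order of the compressed data, so the same item appearing at different positions contributes its vector at each position.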
  18. The computer-readable storage medium according to claim 15, wherein calculating the activation loss value between the simulation data and the noise data by using the preset first loss function comprises:
    calculating the activation loss value between the simulation data and the noise data by using the following first loss function:
    [Formula: Figure PCTCN2021083080-appb-100015]
    wherein [Figure PCTCN2021083080-appb-100016] is the activation loss value, n is the number of samples of the noise data, [Figure PCTCN2021083080-appb-100017] is the m-th data item in the simulation data, and ||·||_1 is the L1 norm.
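The first loss function itself is given only as a figure and is not reproduced here. As a hedged illustration of its stated ingredients (a sum over samples and an L1 norm), the sketch below computes the mean L1 norm of a batch of simulated samples. The sign and normalization used in the actual formula are assumptions, and the names `l1_norm` and `mean_l1` are hypothetical.

```python
def l1_norm(vector):
    """L1 norm: the sum of the absolute values of the components."""
    return sum(abs(v) for v in vector)


def mean_l1(samples):
    """Average L1 norm over n samples (assumed form, not the patent's exact formula)."""
    return sum(l1_norm(s) for s in samples) / len(samples)


samples = [[1.0, -2.0], [0.5, 0.5]]   # two hypothetical simulated samples
value = mean_l1(samples)              # (3.0 + 1.0) / 2
```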
  19. The computer-readable storage medium according to claim 15, wherein calculating the sparse loss value between the output data and the simulation data by using the preset second loss function comprises:
    calculating the sparse loss value between the output data and the simulation data by using the following second loss function:
    [Formula: Figure PCTCN2021083080-appb-100018]
    wherein [Figure PCTCN2021083080-appb-100019] is the sparse loss value, x is the number of samples of the simulation data, [Figure PCTCN2021083080-appb-100020] is the m-th data item in the output data, t_m is a preset parameter, and [Figure PCTCN2021083080-appb-100021] is the softmax loss function.
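The second loss function is likewise given only as a figure. The sketch below shows one common reading of a "softmax loss": softmax cross-entropy averaged over the samples. Treating t_m as a target class index is an assumption, and the function names are hypothetical.

```python
import math


def softmax_cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target index."""
    peak = max(logits)                                   # subtract max for stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target]


def mean_softmax_loss(outputs, targets):
    """Average softmax loss over all samples (assumed form of the sparse loss)."""
    total = sum(softmax_cross_entropy(y, t) for y, t in zip(outputs, targets))
    return total / len(outputs)


outputs = [[0.0, 0.0], [0.0, 0.0]]   # hypothetical model outputs (logits)
targets = [0, 1]                     # hypothetical preset parameters t_m
loss = mean_softmax_loss(outputs, targets)
```

With uniform logits over two classes, each sample contributes ln 2, so the average equals ln 2 as well.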
  20. The computer-readable storage medium according to any one of claims 15 to 19, wherein compressing the model to be compressed according to the simulation data to obtain the compressed model comprises:
    inputting the simulation data into a preset standard compression model for vector operations to obtain a first feature output by the standard compression model, and inputting the simulation data into the model to be compressed for vector operations to obtain a second feature output by the model to be compressed;
    determining a loss function of the model to be compressed according to the first feature and the second feature;
    performing back-propagation on the model to be compressed according to the loss function to obtain the compressed model.
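The feature-matching procedure of claim 20 can be illustrated with a deliberately small sketch. Both models are reduced to a single scalar weight, the loss is taken to be the mean squared error between the two features, and the gradient is written out by hand. All of these choices are assumptions for illustration. The claim states only that the loss is determined from the first and second features.

```python
def compress_by_feature_matching(reference_weight, weight, simulated,
                                 lr=0.1, steps=200):
    """Fit `weight` so its features match those of the reference model."""
    n = len(simulated)
    for _ in range(steps):
        grad = 0.0
        for x in simulated:
            first = reference_weight * x     # feature from the standard compression model
            second = weight * x              # feature from the model to be compressed
            grad += 2.0 * (second - first) * x   # d/dw of (second - first)^2
        weight -= lr * grad / n              # gradient step (back-propagation in 1-D)
    return weight


simulated = [1.0, -1.0, 0.5]                 # hypothetical simulation data
compressed_weight = compress_by_feature_matching(0.5, 2.0, simulated)
```

After enough steps the weight of the model to be compressed converges to the reference weight, which is the expected behaviour when the two features are driven to agree on the simulation data.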
PCT/CN2021/083080 2020-12-18 2021-03-25 Model compression method and apparatus, electronic device, and medium WO2022126902A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011501677.5 2020-12-18
CN202011501677.5A CN112465141B (en) 2020-12-18 2020-12-18 Model compression method, device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
WO2022126902A1 (en) 2022-06-23

Family

ID=74803596

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/083080 WO2022126902A1 (en) 2020-12-18 2021-03-25 Model compression method and apparatus, electronic device, and medium

Country Status (2)

Country Link
CN (1) CN112465141B (en)
WO (1) WO2022126902A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115269940A (en) * 2022-09-30 2022-11-01 佳卓智能科技(南通)有限责任公司 Data compression method of ERP management system

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465141B (en) * 2020-12-18 2024-06-28 平安科技(深圳)有限公司 Model compression method, device, electronic equipment and medium
CN112508194B (en) * 2021-02-02 2022-03-18 支付宝(杭州)信息技术有限公司 Model compression method, system and computing equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190392323A1 (en) * 2018-06-22 2019-12-26 Moffett AI, Inc. Neural network acceleration and embedding compression systems and methods with activation sparsification
CN110874550A (en) * 2018-08-31 2020-03-10 华为技术有限公司 Data processing method, device, equipment and system
CN111612143A (en) * 2020-05-22 2020-09-01 中国科学院自动化研究所 Compression method and system of deep convolutional neural network
CN112465141A (en) * 2020-12-18 2021-03-09 平安科技(深圳)有限公司 Model compression method, model compression device, electronic device and medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108564129B (en) * 2018-04-24 2020-09-08 电子科技大学 Trajectory data classification method based on generation countermeasure network
CN109063742B (en) * 2018-07-06 2023-04-18 平安科技(深圳)有限公司 Butterfly identification network construction method and device, computer equipment and storage medium
CN109784474B (en) * 2018-12-24 2020-12-11 宜通世纪物联网研究院(广州)有限公司 Deep learning model compression method and device, storage medium and terminal equipment
CN109919864A (en) * 2019-02-20 2019-06-21 重庆邮电大学 A kind of compression of images cognitive method based on sparse denoising autoencoder network
EP3716158A3 (en) * 2019-03-25 2020-11-25 Nokia Technologies Oy Compressing weight updates for decoder-side neural networks
CN111814962B (en) * 2020-07-09 2024-05-10 平安科技(深圳)有限公司 Parameter acquisition method and device for identification model, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112465141B (en) 2024-06-28
CN112465141A (en) 2021-03-09

Similar Documents

Publication Publication Date Title
WO2022126902A1 (en) Model compression method and apparatus, electronic device, and medium
WO2022007823A1 (en) Text data processing method and device
CN108520220B (en) Model generation method and device
US11783227B2 (en) Method, apparatus, device and readable medium for transfer learning in machine learning
WO2022213465A1 (en) Neural network-based image recognition method and apparatus, electronic device, and medium
WO2022116420A1 (en) Speech event detection method and apparatus, electronic device, and computer storage medium
CN111523640B (en) Training method and device for neural network model
CN111414987A (en) Training method and training device for neural network and electronic equipment
WO2022228425A1 (en) Model training method and apparatus
CN113434683B (en) Text classification method, device, medium and electronic equipment
CN112949433B (en) Method, device and equipment for generating video classification model and storage medium
CN113159284A (en) Model training method and device
CN112269875B (en) Text classification method, device, electronic equipment and storage medium
CN113409823B (en) Voice emotion recognition method and device, electronic equipment and storage medium
WO2022141840A1 (en) Network architecture search method and apparatus, electronic device, and medium
US20240311931A1 (en) Method, apparatus, device, and storage medium for clustering extraction of entity relationships
WO2022141867A1 (en) Speech recognition method and apparatus, and electronic device and readable storage medium
WO2021208700A1 (en) Method and apparatus for speech data selection, electronic device, and storage medium
CN112749525B (en) Simulation method and apparatus for semiconductor device, server, and storage medium
CN113228056B (en) Runtime hardware simulation method, device, equipment and storage medium
CN116701635A (en) Training video text classification method, training video text classification device, training video text classification equipment and storage medium
CN116468025A (en) Electronic medical record structuring method and device, electronic equipment and storage medium
CN116362301A (en) Model quantization method and related equipment
WO2022222228A1 (en) Method and apparatus for recognizing bad textual information, and electronic device and storage medium
WO2022141838A1 (en) Model confidence analysis method and apparatus, electronic device and computer storage medium

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 21904842; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: PCT application non-entry in European phase. Ref document number: 21904842; Country of ref document: EP; Kind code of ref document: A1