TW202145142A - Method and apparatus of quantization training, image processing, and storage medium

Info

Publication number
TW202145142A
TW202145142A (application number TW110117531A)
Authority
TW
Taiwan
Prior art keywords
model
quantization
training
quantitative
test
Prior art date
Application number
TW110117531A
Other languages
Chinese (zh)
Inventor
吉小洪
許志耿
陳凱亮
顏深根
張行程
Original Assignee
大陸商上海商湯智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 大陸商上海商湯智能科技有限公司
Publication of TW202145142A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/15 Correlation function computation including computation of convolution operations
    • G06F 17/153 Multidimensional correlation or convolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides methods and apparatuses of quantization training and image processing, and a storage medium. The quantization training method includes: obtaining a first quantized model by performing at least one round of iterative quantization training on a neural network model; and obtaining a test result of the first quantized model by testing the first quantized model through simulating a hardware deployment environment.

Description

Quantization training and image processing methods and apparatuses, and storage medium

The present disclosure relates to the field of quantization training, and in particular to quantization training and image processing methods and apparatuses, and storage media.

As more and more neural network models need to be deployed on mobile devices, inference efficiency has become a key issue. When a neural network model is deployed on a mobile device, its structure needs to be simplified, and quantization is one of the commonly used approaches.

Quantizing a neural network model means approximating its high-precision parameters with lower-precision parameters. The high-precision parameters may include floating-point parameters, and the low-precision parameters may include integer parameters. A quantized neural network model can process more data per unit time, occupies less storage space, and so on.

At present, quantization training of a neural network model generally specifies a total number of training iterations based on experience. After the total number is reached, the resulting quantized model is converted into a test model corresponding to the actual hardware environment, and the test model is run in the actual hardware environment to obtain a running result.

According to a first aspect of the embodiments of the present disclosure, a quantization training method is provided. The method includes: a model training device performs at least one round of iterative quantization training on a neural network model to obtain a first quantized model; and the model training device tests the first quantized model by simulating a hardware deployment environment to obtain a test result of the first quantized model.

In some optional embodiments, the model training device testing the first quantized model by simulating a hardware deployment environment to obtain a test result of the first quantized model includes: the model training device tests the first quantized model by calling an objective function to obtain the test result of the first quantized model, where the objective function is used to simulate the hardware deployment environment.

In some optional embodiments, the method further includes: performing conversion processing on the first quantized model to obtain a first test model, where the conversion processing includes removing at least one target unit of the first quantized model, and the target unit is used to perform a quantization operation and/or a dequantization operation on at least one of the output data and the network parameters of a network layer of the neural network model. The model training device testing the first quantized model by simulating a hardware deployment environment to obtain a test result of the first quantized model includes: the model training device tests the first test model by simulating the hardware deployment environment to obtain the test result of the first quantized model.

In some optional embodiments, the conversion processing is implemented by an objective function used to simulate the hardware deployment environment.

In some optional embodiments, the model training device testing the first quantized model by simulating a hardware deployment environment to obtain a test result of the first quantized model includes: the model training device, by simulating the hardware deployment environment, tests the first quantized model using fixed-point data obtained by performing fixed-point processing on test samples and network parameters of the first quantized model, to obtain the test result of the first quantized model.

In some optional embodiments, the method further includes: the model training device performs at least one round of iterative quantization training on the first quantized model to obtain a second quantized model; and the model training device tests the second quantized model by simulating a hardware deployment environment to obtain a test result of the second quantized model.

In some optional embodiments, the method further includes: obtaining, based at least in part on the test result of the first quantized model, a training strategy analysis result for the quantization training of the neural network model, where the training strategy analysis result includes at least one of the following: terminating the quantization training of the neural network model, adjusting the quantization manner of at least one network layer in the neural network model, or adjusting the quantization training manner of subsequent iterations of the neural network model.

In some optional embodiments, the model training device performing at least one round of iterative quantization training on the first quantized model to obtain a second quantized model includes: the model training device performs, in parallel with the testing of the first quantized model, at least one round of iterative quantization training on the first quantized model to obtain the second quantized model.

In some optional embodiments, testing the first quantized model to obtain the test result of the first quantized model includes either of the following: in response to the number of rounds of iterative quantization training performed on the neural network model reaching a preset number, testing the obtained first quantized model to obtain the test result of the first quantized model; or, in response to determining, based on a preset test strategy, that the first quantized model satisfies a test condition, testing the first quantized model to obtain the test result of the first quantized model.

According to a second aspect of the embodiments of the present disclosure, an image processing method is provided, including: inputting an image to be processed into a quantized model to obtain an image processing result output by the quantized model, where the quantized model is a quantized model obtained by the method described in the first aspect.

According to a third aspect of the embodiments of the present disclosure, a quantization training apparatus is provided. The apparatus includes: a first quantization training module, used for a model training device to perform at least one round of iterative quantization training on a neural network model to obtain a first quantized model; and a first test module, used for the model training device to test the first quantized model by simulating a hardware deployment environment to obtain a test result of the first quantized model.

In some optional embodiments, the first test module includes: a first test sub-module, used for the model training device to test the first quantized model by calling an objective function to obtain the test result of the first quantized model, where the objective function is used to simulate the hardware deployment environment.

In some optional embodiments, the apparatus further includes: a model conversion module, configured to perform conversion processing on the first quantized model to obtain a first test model, where the conversion processing includes removing at least one target unit of the first quantized model, and the target unit is used to perform a quantization operation and/or a dequantization operation on at least one of the output data and the network parameters of a network layer of the neural network model. The first test module includes: a second test sub-module, used for the model training device to test the first test model by simulating the hardware deployment environment to obtain the test result of the first quantized model.

In some optional embodiments, the conversion processing is implemented by an objective function used to simulate the hardware deployment environment.

In some optional embodiments, the first test module includes: a third test sub-module, used for the model training device, by simulating a hardware deployment environment, to test the first quantized model using fixed-point data obtained by performing fixed-point processing on test samples and network parameters of the first quantized model, to obtain the test result of the first quantized model.

In some optional embodiments, the apparatus further includes: a second quantization training module, used for the model training device to perform at least one round of iterative quantization training on the first quantized model to obtain a second quantized model; and a second test module, used for the model training device to test the second quantized model by simulating a hardware deployment environment to obtain a test result of the second quantized model.

In some optional embodiments, the apparatus further includes: a determination module, configured to obtain, based at least in part on the test result of the first quantized model, a training strategy analysis result for the quantization training of the neural network model, where the training strategy analysis result includes at least one of the following: terminating the quantization training of the neural network model, adjusting the quantization manner of at least one network layer in the neural network model, or adjusting the quantization training manner of subsequent iterations of the neural network model.

In some optional embodiments, the second quantization training module includes: a quantization training sub-module, used for the model training device to perform, in parallel with the testing of the first quantized model, at least one round of iterative quantization training on the first quantized model to obtain the second quantized model.

In some optional embodiments, the first test module includes either of the following: a fourth test sub-module, configured to test the obtained first quantized model in response to the number of rounds of iterative quantization training performed on the neural network model reaching a preset number, to obtain the test result of the first quantized model; or a fifth test sub-module, configured to test the first quantized model in response to determining, based on a preset test strategy, that the first quantized model satisfies a test condition, to obtain the test result of the first quantized model.

According to a fourth aspect of the embodiments of the present disclosure, an image processing apparatus is provided. The apparatus includes: an image processing module, configured to input an image to be processed into a quantized model to obtain an image processing result output by the quantized model, where the quantized model is a quantized model obtained by the method described in the first aspect.

According to a fifth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided. The storage medium stores a computer program, and the computer program is used to execute the quantization training method described in the first aspect or the image processing method described in the second aspect.

According to a sixth aspect of the embodiments of the present disclosure, a quantization training apparatus is provided, including: a processor; and a memory for storing instructions executable by the processor, where the processor is configured to call the executable instructions stored in the memory to implement the quantization training method described in the first aspect.

According to a seventh aspect of the embodiments of the present disclosure, an image processing apparatus is provided, including: a processor; and a memory for storing instructions executable by the processor, where the processor is configured to call the executable instructions stored in the memory to implement the image processing method described in the second aspect.

The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects:

In the embodiments of the present disclosure, the model training device may perform at least one round of iterative quantization training on the neural network model to obtain the first quantized model, and the model training device then tests the first quantized model by simulating a hardware deployment environment, thereby obtaining a test result of the first quantized model. In the present disclosure, after at least one round of iterative quantization training, the model training device can simulate the hardware deployment environment and test the first quantized model directly within the training framework; the test result of the first quantized model is obtained without deploying it on an actual hardware device, which helps improve the development efficiency of the neural network model.

It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.

Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as recited in the appended claims.

The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to limit the present disclosure. As used in the present disclosure and the appended claims, the singular forms "a", "said", and "the" are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, and so on may be used in the present disclosure to describe various kinds of information, such information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present disclosure, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "while", or "in response to determining".

At present, after quantization training is completed, a neural network model needs to be deployed on an actual mobile device for testing. Specifically, before testing, iterative quantization training may be performed on the neural network model; after a quantized model is obtained, the quantized model is converted into a test model corresponding to the actual hardware environment, and the test model is then deployed on the actual mobile device for testing to obtain a test result. Whether the performance of the quantized model meets expectations is determined based on the test result.

Quantization schemes can generally be divided into post-training quantization and quantization-aware training. Post-training quantization refers to directly quantizing the parameters of a neural network model after the model has been trained with floating-point parameters. This scheme works well for models with a large number of parameters and causes little performance loss, but for models with a small number of parameters it leads to a significant performance drop. Quantization-aware training simulates the quantization behavior during the training of the neural network model: fixed-point parameters are stored in the form of floating-point parameters during training, and the fixed-point parameters are used directly for computation during model inference.

Taking quantization-aware training as an example, the quantization training process is as follows: first, all tensor operations are implemented as modules; then all reused modules are changed to separate ones, that is, module reuse is not allowed; and all functional interfaces in the training framework (such as pytorch) are changed to modules that implement the same functionality.
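
By way of non-limiting illustration (this sketch is not part of the original disclosure), the preparation step described above might look as follows in PyTorch: functional calls are replaced with modules, and each activation gets its own module instance so that nothing is reused.

    import torch
    import torch.nn as nn

    class PreparedBlock(nn.Module):
        """Block rewritten for quantization-aware training: every tensor
        operation is a module, and no module instance is reused."""
        def __init__(self, channels: int):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
            # Separate activation modules (rather than one shared module or a
            # call to torch.nn.functional.relu), so that a fake-quantization
            # hook can later be attached to each output individually.
            self.act1 = nn.ReLU()
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
            self.act2 = nn.ReLU()

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = self.act1(self.conv1(x))
            x = self.act2(self.conv2(x))
            return x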

However, this approach requires first obtaining the quantized model through a complete training process, then converting the trained quantized model into a deployable model and placing it in the deployment environment to test its performance there. This scheme is very inconvenient for model training or parameter tuning, and it must be ensured that nothing goes wrong in the intermediate conversion step. In addition, the behavior of the model in the real environment can only be known after the model has been fully trained, so the model training process cannot be corrected as early as possible.

Furthermore, a shortcoming of training frameworks such as pytorch is that a quantized model containing floating-point parameters is used for testing, whereas a quantized model containing fixed-point parameters is used for inference in the deployment environment. The test results therefore differ to some extent from the inference results on the deployment hardware, and the test results cannot truly reflect the performance of the quantized model in the actual hardware deployment environment. In addition, the quantized model is prone to overfitting during training, and once overfitting occurs, it is not easy to adjust the parameters of the quantized model.

The embodiments of the present disclosure provide a quantization training scheme. For example, FIG. 1 illustrates a quantization training method according to an exemplary embodiment, and the quantization training method may be applied to a model training device. In the embodiments of the present disclosure, both the quantization training and the testing of the quantized model can be performed on the same model training device. The model training device may be an electronic device on which a training platform framework including a quantization algorithm or a quantization tool is deployed; the electronic device may include, but is not limited to, a terminal device using the x86 architecture, such as a personal computer (PC), a mobile phone, or a portable device. The quantization training method includes the following steps:

In step 101, the model training device performs at least one round of iterative quantization training on the neural network model to obtain a first quantized model.

The model training device may use a quantization training manner, including but not limited to quantization-aware training, to quantize the neural network model during training. In the embodiments of the present disclosure, the model training device may perform at least one round of iterative quantization training on the neural network model, thereby obtaining the first quantized model. Specifically, the model training device may perform a limited number of training iterations, such as one or several rounds, to obtain the first quantized model.
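
A minimal sketch of such a limited number of training rounds is given below, assuming the model already contains fake-quantization layers and that data_loader yields image/label batches; the function name is illustrative and not part of the disclosure.

    import torch
    import torch.nn as nn

    def iterative_quantization_training(model: nn.Module, data_loader,
                                        num_rounds: int = 1) -> nn.Module:
        """Run a limited number of quantization-aware training rounds (step 101)
        and return the resulting first quantized model."""
        criterion = nn.CrossEntropyLoss()
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
        model.train()
        for _ in range(num_rounds):            # e.g. one or a few rounds
            for images, labels in data_loader:
                optimizer.zero_grad()
                loss = criterion(model(images), labels)  # fake-quant layers run inside
                loss.backward()
                optimizer.step()
        return model                           # first quantized model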

In step 102, the model training device tests the first quantized model by simulating a hardware deployment environment, and obtains a test result of the first quantized model.

In the embodiments of the present disclosure, the actual hardware deployment environment can be simulated on the model training device, so that the first quantized model can be tested directly on the model training device without converting it to the actual hardware deployment environment for testing. Based on the test result of the first quantized model, a training strategy analysis of the quantization training can be carried out; for example, at least one more round of iterative quantization training can be performed on the first quantized model.

In some embodiments, simulating the hardware deployment environment means encapsulating at least one interface in a modular manner based on the computation logic of the hardware deployment environment. During testing, the interface corresponding to the network structure of the first quantized model can be called from the at least one interface to run the first quantized model and obtain a running result, and the running result is taken as the test result of the first quantized model. The at least one interface is used to implement the functions of different network layers, and the network layers include but are not limited to a convolution layer (conv), a pooling layer (pooling), a linear layer (linear), an activation function layer (prelu), and the like.

For example, if the first quantized model needs to run on a graphics processing unit (GPU) of a mobile device, the hardware deployment environment is the hardware environment of that GPU. The functions of different network layers can be implemented through the at least one interface based on the computation logic of the GPU; the model training device can call, from the at least one interface, the interface corresponding to the network structure of the first quantized model and run the first quantized model, thereby obtaining the test result of the first quantized model.
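
A minimal sketch of such modular interfaces, assuming a simple dictionary-based registry; the names and the layer description format are illustrative, not defined by the disclosure.

    import torch.nn.functional as F

    # Each entry simulates how the deployment hardware (e.g. a mobile GPU)
    # executes one kind of network layer; a real simulation would reproduce the
    # hardware's fixed-point arithmetic rather than call these float kernels.
    SIMULATED_OPS = {
        "conv":    lambda x, w, b: F.conv2d(x, w, b, padding=1),
        "pooling": lambda x: F.max_pool2d(x, kernel_size=2),
        "linear":  lambda x, w, b: F.linear(x, w, b),
        "prelu":   lambda x, a: F.prelu(x, a),
    }

    def run_on_simulated_hardware(layers, x):
        """Run a layer-by-layer description of the model through the simulated
        deployment interfaces and return the output as the test result."""
        for name, args in layers:     # `layers` mirrors the model's network structure
            x = SIMULATED_OPS[name](x, *args)
        return x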

In the above embodiments, after at least one round of iterative quantization training, the model training device can simulate the hardware deployment environment and test the first quantized model directly within the training framework; the test result of the first quantized model is obtained without deploying the first quantized model on an actual hardware device, which helps improve the development efficiency of the neural network model.

In some optional embodiments, for the above step 102, the model training device may place the at least one interface, which is encapsulated in a modular manner based on the computation logic of the hardware deployment environment, in an objective function; that is, the objective function can be used to simulate the hardware deployment environment. After the first quantized model is obtained, the interface in the objective function corresponding to the network structure of the first quantized model can be called by means of a function call, thereby obtaining the test result of the first quantized model.

In one possible implementation, the quantization training framework of the model training platform uses the pytorch framework; accordingly, the objective function may be the model.quant() function in the quantization training framework.
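
A hedged usage sketch of such an objective function is given below. The quant() method is the function named by the disclosure's quantization training framework rather than a standard PyTorch API, and the evaluation loop around it is illustrative only.

    import torch

    def test_under_simulated_deployment(model, test_loader):
        """Call the objective function and evaluate the first quantized model in
        place, without converting or deploying it to real hardware."""
        deploy_like_model = model.quant()   # objective function simulating the hardware
        correct, total = 0, 0
        with torch.no_grad():
            for images, labels in test_loader:
                preds = deploy_like_model(images).argmax(dim=1)
                correct += (preds == labels).sum().item()
                total += labels.numel()
        return correct / total              # test result of the first quantized model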

In the above embodiments, the model training platform can call the target module to test the first quantized model and obtain the test result of the first quantized model. This makes it possible, on the same hardware device, to simulate the hardware deployment environment, to perform iterative quantization training, and also to obtain the test result of the first quantized model directly, which helps improve the development efficiency of the neural network model.

In some optional embodiments, during the iterative quantization training process, a fake quantization layer is added to the output of each network layer of the neural network model, and the fake quantization layer can simulate the precision loss caused by quantizing that network layer.

For example, the original output data of network layer 1 has a first precision; assume the first precision is FP32 and the value is 1.1. The fake quantization layer quantizes the original output data into a value of a second precision; assume the second precision is uint8 and the value is 1. However, network layer 2, which takes the output of network layer 1 as its input, also needs values of the first precision, so the value obtained by quantizing network layer 1 also needs to be dequantized by the fake quantization layer: the second-precision value 1 is converted back to the first precision, with the value unchanged. According to the above process, the precision loss of network layer 1 can be determined by the fake quantization layer as 1.1 - 1 = 0.1. In the same way, the precision loss of other network layers can be determined.
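
A minimal sketch of such a fake-quantization (quantize-then-dequantize) step, reproducing the 1.1 → 1 → 0.1 example above; the scale and zero-point handling is simplified and is not the exact scheme of the disclosure.

    import torch

    def fake_quantize(x: torch.Tensor, scale: float = 1.0, zero_point: int = 0):
        """Quantize a float tensor to uint8 and immediately dequantize it,
        returning the dequantized tensor and the simulated precision loss."""
        q = torch.clamp(torch.round(x / scale) + zero_point, 0, 255).to(torch.uint8)
        dq = (q.to(torch.float32) - zero_point) * scale
        return dq, x - dq

    x = torch.tensor([1.1])        # FP32 output of network layer 1
    dq, loss = fake_quantize(x)    # dq = 1.0 (uint8 value 1, dequantized); loss is about 0.1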

In some optional embodiments, for example as shown in FIG. 2, before step 102 is performed, the method may further include step 103.

In step 103, conversion processing is performed on the first quantized model to obtain a first test model.

In the embodiments of the present disclosure, when testing the first quantized model, the fake quantization layers in it can be removed, and the same model structure as in the actual deployment environment is used for testing, so that the running of the quantization-trained neural network model on the actual deployment hardware can be simulated directly and a more realistic test result can be obtained. Therefore, in the embodiments of the present disclosure, conversion processing needs to be performed on the first quantized model, and the conversion processing includes removing at least one fake quantization layer of the first quantized model. In the embodiments of the present disclosure, the fake quantization layers can be implemented in a modular manner, that is, each fake quantization layer can correspond to one target unit; accordingly, the conversion processing includes removing at least one target unit of the first quantized model, and the target unit is used to perform a quantization operation and/or a dequantization operation on at least one of the output data and the network parameters of a network layer of the neural network model.

Accordingly, step 102 may include:

the model training device tests the first test model by simulating the hardware deployment environment to obtain the test result of the first quantized model.

For example, the original network structure of the neural network model is shown in FIG. 3A, and the first quantized model including at least one target unit is shown in FIG. 3B, where target unit 1 performs a quantization operation and/or a dequantization operation on the network parameters of the convolution layer, that is, the weight values, and target unit 2 performs a quantization operation and/or a dequantization operation on the output data of the activation function layer. The conversion processing removes the target units from the first quantized model, and the resulting first test model is shown in FIG. 3C.
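
A hedged sketch of such conversion processing, assuming the target units are implemented as instances of an illustrative FakeQuant module class that can simply be replaced by identity modules.

    import torch.nn as nn

    class FakeQuant(nn.Module):
        """Illustrative target unit that quantizes/dequantizes its input
        (the quantize-dequantize logic is omitted here for brevity)."""
        def forward(self, x):
            return x

    def convert_to_test_model(quant_model: nn.Module) -> nn.Module:
        """Conversion processing: strip every fake-quantization target unit so that
        the first test model has the same structure as the deployed model."""
        for name, child in quant_model.named_children():
            if isinstance(child, FakeQuant):
                setattr(quant_model, name, nn.Identity())  # remove the target unit
            else:
                convert_to_test_model(child)               # recurse into submodules
        return quant_model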

In the present disclosure, the first test model is tested by the model training device, so as to obtain the test result of the first quantized model.

In the above embodiments, the first quantized model needs to perform a quantization operation and/or a dequantization operation on at least one of the output data and the network parameters of a network layer of the neural network model. During testing, however, the quantization operation and/or dequantization operation is no longer needed. Therefore, conversion processing can be performed on the first quantized model to remove at least one target unit in the first quantized model and obtain the first test model, and the first test model is tested, so that the required test result of the first quantized model can be obtained, achieving the purpose of testing the first quantized model on the model training device.

In some optional embodiments, the process in which the model training device performs conversion processing on the first quantized model is implemented by the objective function used to simulate the hardware deployment environment.

After obtaining the first quantized model, the model training device calls the objective function to perform conversion processing on the first quantized model and remove the at least one target unit, and then tests the resulting first test model to obtain the test result of the first quantized model.

In some optional embodiments, when testing the first quantized model, fixed-point processing may first be performed on the test samples and network parameters of the first quantized model to obtain fixed-point data, and then the first quantized model is tested using the obtained fixed-point data, thereby obtaining the test result.

Fixed-point processing refers to converting data from a first precision to a second precision, where the first precision is higher than the second precision; for example, the first precision is the floating-point precision FP32 and the second precision is the integer precision uint8.

The model training device needs to perform fixed-point processing on the test samples and network parameters, such as weight values, that are input to the entire first quantized model, obtain the fixed-point data, and then perform the test based on the fixed-point data.

For example, in FIG. 3B, the precisions of the test samples (input values) and the network parameters (the weight values corresponding to the input values) of the first quantized model are both FP32, and the precision of the fixed-point data after fixed-point processing is uint8. The first test model in FIG. 3C can be tested based on the uint8 fixed-point data to obtain the test result of the first quantized model.
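
A minimal sketch of such fixed-point processing, assuming simple per-tensor affine quantization from FP32 to uint8; the scale and zero-point derivation is illustrative, not the disclosure's exact scheme.

    import torch

    def to_fixed_point(t: torch.Tensor):
        """Convert an FP32 tensor to uint8 fixed-point data, returning the data
        together with its scale and zero-point."""
        t_min, t_max = t.min().item(), t.max().item()
        scale = max(t_max - t_min, 1e-8) / 255.0
        zero_point = max(0, min(255, int(round(-t_min / scale))))
        q = torch.clamp(torch.round(t / scale) + zero_point, 0, 255).to(torch.uint8)
        return q, scale, zero_point

    # Fixed-point test samples and weight values are then fed to the first test
    # model running under the simulated hardware deployment environment.
    samples_q, s_scale, s_zp = to_fixed_point(torch.randn(1, 3, 224, 224))
    weights_q, w_scale, w_zp = to_fixed_point(torch.randn(16, 3, 3, 3))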

In the above embodiments, the model training device tests the first quantized model using the fixed-point data obtained by performing fixed-point processing on the test samples and network parameters of the first quantized model, to obtain the test result of the first quantized model, which makes the testing process more reasonable and accurate.

In some optional embodiments, for example as shown in FIG. 4, the above method may further include steps 104 and 105.

In step 104, the model training device performs at least one round of iterative quantization training on the first quantized model to obtain a second quantized model.

In the embodiments of the present disclosure, after obtaining the first quantized model, the model training device may continue to perform at least one round of iterative quantization training on the first quantized model, thereby obtaining the second quantized model.

In one possible implementation, when the test result of the first quantized model indicates that the first quantized model does not meet the quantization training requirement, for example, when the accuracy of processing the test samples does not meet the design requirement of the neural network model, the model training device continues to perform at least one round of iterative quantization training on the first quantized model to obtain the second quantized model.

In another possible implementation, the model training device may also perform at least one round of iterative quantization training on the first quantized model while testing the first quantized model, to obtain the second quantized model. If the test result of the first quantized model then indicates that the first quantized model does not meet the quantization training requirement, at least one round of iterative quantization training on the first quantized model has in effect already been started in advance, which provides higher usability.
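
A hedged sketch of running the simulated-deployment test in parallel with the next round of training, reusing the illustrative helper functions from the earlier sketches; the disclosure does not prescribe a concrete parallelization mechanism, and a background thread is used here purely for illustration.

    import copy
    import threading

    def train_and_test_in_parallel(first_model, data_loader, test_loader):
        """Start the simulated-deployment test of the first quantized model while
        the next round of iterative quantization training is already running."""
        snapshot = copy.deepcopy(first_model)   # frozen copy used only for testing
        results = {}

        def run_test():
            results["test"] = test_under_simulated_deployment(snapshot, test_loader)

        tester = threading.Thread(target=run_test)
        tester.start()
        second_model = iterative_quantization_training(first_model, data_loader)  # step 104
        tester.join()
        return second_model, results["test"]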

In step 105, the model training device tests the second quantized model by simulating the hardware deployment environment to obtain a test result of the second quantized model.

By simulating the hardware deployment environment, the model training device can test the second quantized model and obtain the test result of the second quantized model. Based on the test result of the second quantized model, a training strategy analysis of the quantization training can be carried out; for example, at least one more round of iterative quantization training can be performed on the second quantized model.

In some optional embodiments, for example as shown in FIG. 5, after the test result of the first quantized model is obtained in step 102, the above method may further include step 106.

In step 106, based at least in part on the test result of the first quantized model, a training strategy analysis result for the quantization training of the neural network model is obtained.

Specifically, based on the test result of at least one test, it can be determined whether the current quantization training scheme of the neural network model is feasible, for example, whether the design of the loss function is reasonable and whether the design of the quantization scheme is reasonable, or an adjustment to the current quantization training scheme can be determined, for example, whether one or more of the network structure, network hyperparameters, loss function, quantization scheme, and so on need to be modified, or an adjustment strategy or other detailed information can be further given. In some optional examples, it can also be determined whether the current iterative quantization training needs to be stopped, for example, it is determined that early stopping is needed, or it is determined that the neural network model has already met expectations, and so on, which is not limited in the embodiments of the present disclosure.

In the embodiments of the present disclosure, the training strategy analysis result includes at least one of the following: terminating the quantization training of the neural network model, adjusting the quantization manner of at least one network layer in the neural network model, or adjusting the quantization training manner of subsequent iterations of the neural network model.

In the embodiments of the present disclosure, training and testing are carried out on the same platform, that is, on the same model training device, without the need for model conversion and deployment, thereby optimizing the development process of the neural network model. Moreover, since testing is performed during the training process, it is easier, based on the test result of the first quantized model combined with evaluation metrics, to determine whether the iterative quantization training of the neural network model needs to continue. The evaluation metrics include, but are not limited to, metrics used for early stopping and the like.

For example, if the test result of the first quantized model indicates that the first quantized model does not meet the quantization training requirement, the quantization training of the neural network model cannot be terminated. For another example, if the test result of the first quantized model indicates that the first quantized model is overfitting, the quantization training of the neural network model can be terminated early according to the evaluation metrics.

Based on the test result of the first quantized model, the model training device may also adjust the quantization manner of at least one network layer in the neural network model; for example, if post-training quantization was used before, the quantization manner can be adjusted to quantization-aware training based on the test result of the first quantized model.

Based on the test result of the first quantized model, the model training device may also adjust the quantization training manner of subsequent iterations of the neural network model, where the quantization training manner includes but is not limited to the number of rounds of iterative quantization training, the adjustment of the loss function, and the like. For example, if N rounds of iterative quantization training were previously performed on the neural network model to obtain the first quantized model, M rounds of iterative quantization training may be performed on the first quantized model, where M and N are both positive integers and may be equal or unequal.

In the above embodiments, based on the test result of the first quantized model, a training strategy analysis result for the quantization training of the neural network model can be obtained, and the quantization training can be adjusted based on the analysis result, so that the quantization training of the neural network model is more reasonable.

In some optional embodiments, the model training device may stop the iterative quantization training at different training timings, thereby obtaining the first quantized model.

In an optional implementation, the model training device may test the obtained first quantized model when the number of rounds of iterative quantization training performed on the neural network model reaches a preset number, to obtain the test result of the first quantized model. The preset number may be much smaller than the previously set total number of iterative quantization training rounds for the neural network model; for example, if the total number is 1000, the preset number may be any positive integer smaller than 1000. Based on the test result of the first quantized model combined with evaluation metrics such as early stopping, the iterative quantization training of the neural network model can be terminated early, thereby avoiding overfitting of the final quantized model.
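
A hedged sketch of this timing logic, testing every preset number of rounds and stopping early once the simulated-deployment metric stops improving; the helper functions are the illustrative ones from the earlier sketches and the thresholds are arbitrary.

    def train_with_periodic_testing(model, data_loader, test_loader,
                                    total_rounds=1000, preset_rounds=50, patience=3):
        """Test under the simulated deployment environment every preset_rounds
        rounds and stop early if the test metric has not improved for a while."""
        best_metric, rounds_without_improvement = float("-inf"), 0
        for round_idx in range(1, total_rounds + 1):
            model = iterative_quantization_training(model, data_loader, num_rounds=1)
            if round_idx % preset_rounds == 0:
                metric = test_under_simulated_deployment(model, test_loader)  # step 102
                if metric > best_metric:
                    best_metric, rounds_without_improvement = metric, 0
                else:
                    rounds_without_improvement += 1
                if rounds_without_improvement >= patience:  # early stop, avoid overfitting
                    break
        return model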

In another possible implementation, the model training device may obtain the first quantized model when it is determined, based on a preset test strategy, that the first quantized model satisfies a test condition, so as to test the first quantized model. The test condition includes, but is not limited to, the loss function corresponding to the neural network model changing only slightly, the accuracy of the neural network model being far below the preset accuracy requirement, the preset total number of quantization iterations having been reached, and so on.

In the above embodiments, the model training device can test the obtained first quantized model based on the training timing, thereby obtaining the test result of the first quantized model, which helps improve the development efficiency of the neural network model.

In some optional embodiments, the present disclosure further provides an image processing method, in which an image to be processed can be input into a quantized model to obtain an image processing result output by the quantized model.

The quantized model is a quantized model obtained by performing at least one round of iterative quantization training on a neural network model through the quantization training method described above.

The image to be processed may be an image collected in a vision task, and vision task analysis can be performed on the collected image through the quantized model, where the vision task analysis includes, but is not limited to, image classification, image semantic segmentation, human keypoint detection, and the like, providing high usability.

Corresponding to the foregoing method embodiments, the present disclosure further provides apparatus embodiments.

As shown in FIG. 6, FIG. 6 is a block diagram of a quantization training apparatus according to an exemplary embodiment of the present disclosure. The apparatus includes: a first quantization training module 210, used for a model training device to perform at least one round of iterative quantization training on a neural network model to obtain a first quantized model; and a first test module 220, used for the model training device to test the first quantized model by simulating a hardware deployment environment to obtain a test result of the first quantized model.

In some optional embodiments, the first test module includes: a first test sub-module, used for the model training device to test the first quantized model by calling an objective function to obtain the test result of the first quantized model, where the objective function is used to simulate the hardware deployment environment.

In some optional embodiments, the apparatus further includes: a model conversion module, configured to perform conversion processing on the first quantized model to obtain a first test model, where the conversion processing includes removing at least one target unit of the first quantized model, and the target unit is used to perform a quantization operation and/or a dequantization operation on at least one of the output data and the network parameters of a network layer of the neural network model. The first test module includes: a second test sub-module, used for the model training device to test the first test model by simulating the hardware deployment environment to obtain the test result of the first quantized model.

In some optional embodiments, the conversion processing is implemented by an objective function used to simulate the hardware deployment environment.

在一些可選實施例中,所述第一測試模組包括:第三測試子模組,用於所述模型訓練設備通過模擬硬體部署環境,利用對所述第一量化模型的測試樣本和網路參數進行定點化處理得到的定點數據,對所述第一量化模型進行測試,得到所述第一量化模型的測試結果。In some optional embodiments, the first test module includes: a third test sub-module for the model training device to simulate a hardware deployment environment and use the test samples and The fixed-point data obtained by performing fixed-point processing on network parameters is used to test the first quantization model to obtain a test result of the first quantization model.

在一些可選實施例中,所述裝置還包括:第二量化訓練模組,用於所述模型訓練設備對所述第一量化模型進行至少一輪迭代量化訓練,得到第二量化模型;第二測試模組,用於所述模型訓練設備通過模擬硬體部署環境,對所述第二量化模型進行測試,得到所述第二量化模型的測試結果。In some optional embodiments, the apparatus further includes: a second quantization training module, used for the model training device to perform at least one round of iterative quantization training on the first quantization model to obtain a second quantization model; a second quantization model; A test module is used for the model training device to test the second quantization model by simulating a hardware deployment environment to obtain a test result of the second quantization model.

在一些可選實施例中,所述裝置還包括:確定模組,用於至少部分地基於所述第一量化模型的測試結果,得到對所述神經網路模型進行量化訓練的訓練策略分析結果,其中,所述訓練策略分析結果包括下列中的至少一項:終止所述神經網路模型的量化訓練、調整所述神經網路模型中至少一個網路層的量化方式、調整所述神經網路模型的後續迭代的量化訓練方式。In some optional embodiments, the apparatus further includes: a determination module configured to obtain a training strategy analysis result for quantitative training of the neural network model based at least in part on the test result of the first quantitative model , wherein the training strategy analysis result includes at least one of the following: terminating the quantization training of the neural network model, adjusting the quantization method of at least one network layer in the neural network model, adjusting the neural network model Quantized training method for subsequent iterations of the road model.

In some optional embodiments, the second quantization training module includes: a quantization training sub-module, configured for the model training device to perform, in parallel with the test of the first quantization model, at least one round of iterative quantization training on the first quantization model to obtain the second quantization model.
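
Because the test only reads a snapshot of the first quantization model, it can run concurrently with the next rounds of training. The following sketch shows one possible arrangement using a background thread; the function names, the stand-in training and test callables, and the round count are assumptions for illustration.

```python
import threading

def run_test_and_continue_training(first_quant_model, train_one_round, test_model, rounds=5):
    """Test the first quantization model in a background thread while the training
    thread keeps running further rounds to produce the second quantization model."""
    test_result = {}

    def _test():
        test_result["value"] = test_model(first_quant_model)

    tester = threading.Thread(target=_test)
    tester.start()                                  # test runs concurrently with training
    second_quant_model = first_quant_model
    for _ in range(rounds):                         # further rounds of quantization training
        second_quant_model = train_one_round(second_quant_model)
    tester.join()
    return second_quant_model, test_result["value"]

# Tiny stand-ins so the sketch runs on its own.
second, result = run_test_and_continue_training(
    0.0,
    train_one_round=lambda m: m + 1.0,              # pretend one training round improves the model
    test_model=lambda m: {"accuracy": 0.9},
)
print(second, result)                               # 5.0 {'accuracy': 0.9}
```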

In some optional embodiments, the first test module includes either of the following: a fourth test sub-module, configured to test the obtained first quantization model in response to the number of rounds of iterative quantization training performed on the neural network model reaching a preset number, so as to obtain the test result of the first quantization model; or a fifth test sub-module, configured to test the first quantization model in response to determining, based on a preset test strategy, that the first quantization model satisfies a test condition, so as to obtain the test result of the first quantization model.
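
Either trigger for the test can be expressed as a small predicate evaluated after each training round. The sketch below is an illustrative assumption, with the interval, strategy callable, and metric names invented for the example.

```python
def should_test(round_idx, preset_interval=None, test_strategy=None, metrics=None):
    """Decide whether to test the current quantization model, using either trigger
    described above: a preset number of training rounds, or a preset test strategy."""
    if preset_interval is not None and round_idx > 0 and round_idx % preset_interval == 0:
        return True                                  # trained for the preset number of rounds
    if test_strategy is not None and metrics is not None and test_strategy(metrics):
        return True                                  # strategy says the test condition is met
    return False

# Example: test every 100 rounds, or earlier if the training loss is already low enough.
strategy = lambda m: m["loss"] < 0.05
print(should_test(100, preset_interval=100))                              # True
print(should_test(37, test_strategy=strategy, metrics={"loss": 0.04}))    # True
print(should_test(37, test_strategy=strategy, metrics={"loss": 0.40}))    # False
```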

The present disclosure further provides an image processing apparatus, the apparatus including: an image processing module, configured to input an image to be processed into a quantization model to obtain an image processing result output by the quantization model, where the quantization model is a quantization model obtained by the quantization training method described above.
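
As a usage illustration of the image processing module, the toy example below feeds one image into a small quantized classifier. The image size, weight shapes, scales, and the choice of returning a class index as the image processing result are assumptions made only for this sketch.

```python
import numpy as np

def quantized_image_model(image, w_int, w_scale, act_scale):
    """Run a toy quantized classifier on one image: int8 weights and int8 activations."""
    x = image.astype(np.float64).reshape(-1) / 255.0             # flatten and normalise
    x_int = np.clip(np.round(x / act_scale), -128, 127).astype(np.int32)
    logits = (x_int @ w_int) * (act_scale * w_scale)
    return int(np.argmax(logits))                                # image processing result: class id

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)        # the image to be processed
w = rng.normal(size=(64, 10)) * 0.1
w_scale = np.abs(w).max() / 127
w_int = np.round(w / w_scale).astype(np.int32)
print("predicted class:", quantized_image_model(image, w_int, w_scale, act_scale=1 / 127))
```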

As for the apparatus embodiments, since they basically correspond to the method embodiments, reference may be made to the relevant descriptions of the method embodiments. The apparatus embodiments described above are merely illustrative; the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present disclosure. Those of ordinary skill in the art can understand and implement the solution without creative effort.

Embodiments of the present disclosure further provide a computer-readable storage medium storing a computer program, where the computer program is used to execute the quantization training method or the image processing method described above.

In some optional embodiments, embodiments of the present disclosure provide a computer program product including computer-readable code. When the computer-readable code runs on a device, a processor in the device executes instructions for implementing the quantization training method or the image processing method provided by any of the above embodiments.

The computer program product may be implemented by hardware, software, or a combination thereof. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a Software Development Kit (SDK).

Embodiments of the present disclosure further provide a quantization training apparatus, including: a processor; and a memory for storing instructions executable by the processor, where the processor is configured to call the executable instructions stored in the memory to implement the quantization training method described above.

FIG. 7 is a schematic diagram of the hardware structure of a quantization training apparatus according to an embodiment of the present disclosure. The quantization training apparatus 310 for the neural network model includes a processor 311, and may further include an input device 312, an output device 313, and a memory 314. The input device 312, the output device 313, the memory 314, and the processor 311 are connected to one another through a bus.

The memory includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or a compact disc read-only memory (CD-ROM), and is used for storing related instructions and data.

The input device is used for inputting data and/or signals, and the output device is used for outputting data and/or signals. The output device and the input device may be independent devices or may be integrated into a single device.

The processor may include one or more processors, for example, one or more central processing units (CPUs). Where the processor is a CPU, the CPU may be a single-core CPU or a multi-core CPU.

The memory is used to store the program code and data of the network device.

The processor is used to call the program code and data in the memory to execute the steps in the above method embodiments. For details, reference may be made to the descriptions in the method embodiments, which are not repeated here.

It can be understood that FIG. 7 shows only a simplified design of the quantization training apparatus. In practical applications, the quantization training apparatus may further include other necessary elements, including but not limited to any number of input/output devices, processors, controllers, memories, and the like, and all quantization training apparatuses capable of implementing the embodiments of the present disclosure fall within the protection scope of the present disclosure.

Embodiments of the present disclosure further provide an image processing apparatus, including: a processor; and a memory for storing instructions executable by the processor, where the processor is configured to call the executable instructions stored in the memory to implement the image processing method described above.

Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present disclosure is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common general knowledge or customary technical means in the technical field not disclosed herein. The specification and the embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.

The above descriptions are merely preferred embodiments of the present disclosure and are not intended to limit the present disclosure. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present disclosure shall fall within the protection scope of the present disclosure.

Reference numerals in the drawings:
101: Perform at least one round of iterative quantization training on the neural network model to obtain a first quantization model
102: Test the first quantization model by simulating a hardware deployment environment to obtain a test result of the first quantization model
103: Perform conversion processing on the first quantization model to obtain a first test model
104: Perform at least one round of iterative quantization training on the first quantization model to obtain a second quantization model
105: Test the second quantization model by simulating a hardware deployment environment to obtain a test result of the second quantization model
106: Obtain, based on the test result of the first quantization model, a training strategy analysis result for the quantization training of the neural network model
210: First quantization training module
220: First test module
310: Quantization training apparatus
311: Processor
312: Input device
313: Output device
314: Memory

FIG. 1 is a flowchart of a quantization training method according to an exemplary embodiment of the present disclosure.
FIG. 2 is a flowchart of another quantization training method according to an exemplary embodiment of the present disclosure.
FIG. 3A is a schematic diagram of the architecture of a neural network model according to an exemplary embodiment of the present disclosure.
FIG. 3B is a schematic diagram of the architecture of a first quantization model according to an exemplary embodiment of the present disclosure.
FIG. 3C is a schematic diagram of the architecture of a first test model according to an exemplary embodiment of the present disclosure.
FIG. 4 is a flowchart of another quantization training method according to an exemplary embodiment of the present disclosure.
FIG. 5 is a flowchart of another quantization training method according to an exemplary embodiment of the present disclosure.
FIG. 6 is a block diagram of a quantization training apparatus according to an exemplary embodiment of the present disclosure.
FIG. 7 is a schematic structural diagram of a quantization training apparatus according to an exemplary embodiment of the present disclosure.

Claims (15)

1. A quantization training method, comprising: performing at least one round of iterative quantization training on a neural network model to obtain a first quantization model; and testing the first quantization model by simulating a hardware deployment environment to obtain a test result of the first quantization model.

2. The quantization training method according to claim 1, wherein testing the first quantization model by simulating a hardware deployment environment to obtain the test result of the first quantization model comprises: testing the first quantization model by calling an objective function to obtain the test result of the first quantization model, wherein the objective function is used to simulate the hardware deployment environment.

3. The quantization training method according to claim 1 or 2, further comprising: performing conversion processing on the first quantization model to obtain a first test model, wherein the conversion processing comprises removing at least one target unit of the first quantization model, and the target unit is used to perform a quantization operation and/or a dequantization operation on at least one of output data and network parameters of a network layer of the neural network model; wherein testing the first quantization model by simulating a hardware deployment environment to obtain the test result of the first quantization model comprises: testing the first test model by simulating the hardware deployment environment to obtain the test result of the first quantization model.

4. The quantization training method according to claim 3, wherein the conversion processing is implemented by an objective function used to simulate the hardware deployment environment.

5. The quantization training method according to claim 1 or 2, wherein testing the first quantization model by simulating a hardware deployment environment to obtain the test result of the first quantization model comprises: by simulating the hardware deployment environment, testing the first quantization model using fixed-point data obtained by performing fixed-point processing on test samples and network parameters of the first quantization model, to obtain the test result of the first quantization model.

6. The quantization training method according to claim 1 or 2, further comprising: performing at least one round of iterative quantization training on the first quantization model to obtain a second quantization model; and testing the second quantization model by simulating the hardware deployment environment to obtain a test result of the second quantization model.
7. The quantization training method according to claim 1 or 2, further comprising: obtaining, based on the test result of the first quantization model, a training strategy analysis result for the quantization training of the neural network model, wherein the training strategy analysis result comprises at least one of the following: terminating the quantization training of the neural network model, adjusting the quantization manner of at least one network layer in the neural network model, or adjusting the quantization training manner of subsequent iterations of the neural network model.

8. The quantization training method according to claim 6, wherein performing at least one round of iterative quantization training on the first quantization model to obtain the second quantization model comprises: performing, in parallel with the test of the first quantization model, at least one round of iterative quantization training on the first quantization model to obtain the second quantization model.

9. The quantization training method according to claim 1 or 2, wherein testing the first quantization model to obtain the test result of the first quantization model comprises either of the following: in response to the number of rounds of iterative quantization training performed on the neural network model reaching a preset number, testing the obtained first quantization model to obtain the test result of the first quantization model; or in response to determining, based on a preset test strategy, that the first quantization model satisfies a test condition, testing the first quantization model to obtain the test result of the first quantization model.

10. An image processing method, comprising: inputting an image to be processed into a first quantization model to obtain an image processing result output by the first quantization model, wherein the first quantization model is a quantization model obtained by the quantization training method according to any one of claims 1 to 9.

11. A quantization training apparatus, comprising: a first quantization training module, configured to perform at least one round of iterative quantization training on a neural network model to obtain a first quantization model; and a first test module, configured to test the first quantization model by simulating a hardware deployment environment to obtain a test result of the first quantization model.

12. An image processing apparatus, comprising: an image processing module, configured to input an image to be processed into a quantization model to obtain an image processing result output by the quantization model, wherein the quantization model is a quantization model trained by the quantization training method according to any one of claims 1 to 9.
13. A computer-readable storage medium, wherein the storage medium stores a computer program, and the computer program is used to execute the quantization training method according to any one of claims 1 to 9 or the image processing method according to claim 10.

14. A quantization training apparatus, comprising: a processor; and a memory for storing instructions executable by the processor, wherein the processor is configured to call the executable instructions stored in the memory to implement the quantization training method according to any one of claims 1 to 9.

15. An image processing apparatus, comprising: a processor; and a memory for storing instructions executable by the processor, wherein the processor is configured to call the executable instructions stored in the memory to implement the image processing method according to claim 10.
TW110117531A 2020-05-21 2021-05-14 Method and apparatus of quantization training, image processing, and storage medium TW202145142A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010434807.1A CN111598237A (en) 2020-05-21 2020-05-21 Quantization training method, image processing device, and storage medium
CN202010434807.1 2020-05-21

Publications (1)

Publication Number Publication Date
TW202145142A true TW202145142A (en) 2021-12-01

Family

ID=72185991

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110117531A TW202145142A (en) 2020-05-21 2021-05-14 Method and apparatus of quantization training, image processing, and storage medium

Country Status (5)

Country Link
JP (1) JP2022540298A (en)
KR (1) KR20220013946A (en)
CN (1) CN111598237A (en)
TW (1) TW202145142A (en)
WO (1) WO2021233069A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598237A (en) * 2020-05-21 2020-08-28 上海商汤智能科技有限公司 Quantization training method, image processing device, and storage medium
CN112446491B (en) * 2021-01-20 2024-03-15 上海齐感电子信息科技有限公司 Real-time automatic quantification method and real-time automatic quantification system for neural network model
CN112884144A (en) * 2021-02-01 2021-06-01 上海商汤智能科技有限公司 Network quantization method and device, electronic equipment and storage medium
CN112801303A (en) * 2021-02-07 2021-05-14 中兴通讯股份有限公司 Intelligent pipeline processing method and device, storage medium and electronic device
CN113011581B (en) * 2021-02-23 2023-04-07 北京三快在线科技有限公司 Neural network model compression method and device, electronic equipment and readable storage medium
CN113762503A (en) * 2021-05-27 2021-12-07 腾讯云计算(北京)有限责任公司 Data processing method, device, equipment and computer readable storage medium
CN113762403B (en) * 2021-09-14 2023-09-05 杭州海康威视数字技术股份有限公司 Image processing model quantization method, device, electronic equipment and storage medium
CN115496200B (en) * 2022-09-05 2023-09-22 中国科学院半导体研究所 Neural network quantization model training method, device and equipment

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190340499A1 (en) * 2018-05-04 2019-11-07 Microsoft Technology Licensing, Llc Quantization for dnn accelerators
CN110555508B (en) * 2018-05-31 2022-07-12 赛灵思电子科技(北京)有限公司 Artificial neural network adjusting method and device
CN109165730B (en) * 2018-09-05 2022-04-26 电子科技大学 State quantization network implementation method in cross array neuromorphic hardware
GB2580171B (en) * 2018-12-21 2021-02-17 Imagination Tech Ltd Methods and systems for selecting quantisation parameters for deep neural networks using back-propagation
CN110097186B (en) * 2019-04-29 2023-04-18 山东浪潮科学研究院有限公司 Neural network heterogeneous quantitative training method
CN110135582B (en) * 2019-05-09 2022-09-27 北京市商汤科技开发有限公司 Neural network training method, neural network training device, image processing method, image processing device and storage medium
CN110334802A (en) * 2019-05-23 2019-10-15 腾讯科技(深圳)有限公司 A kind of construction method of neural network model, device, equipment and storage medium
CN110188880A (en) * 2019-06-03 2019-08-30 四川长虹电器股份有限公司 A kind of quantization method and device of deep neural network
CN110414679A (en) * 2019-08-02 2019-11-05 厦门美图之家科技有限公司 Model training method, device, electronic equipment and computer readable storage medium
CN111598237A (en) * 2020-05-21 2020-08-28 上海商汤智能科技有限公司 Quantization training method, image processing device, and storage medium

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI795135B (en) * 2021-12-22 2023-03-01 財團法人工業技術研究院 Quantization method for neural network model and deep learning accelerator

Also Published As

Publication number Publication date
KR20220013946A (en) 2022-02-04
CN111598237A (en) 2020-08-28
JP2022540298A (en) 2022-09-15
WO2021233069A1 (en) 2021-11-25
