JP7408799B2 - Neural network model compression - Google Patents
Neural network model compression
- Publication number
- JP7408799B2 (application JP2022527688A)
- Authority
- JP
- Japan
- Prior art keywords
- neural network
- layer
- model
- tensor
- coefficients
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000003062 neural network model Methods 0.000 title description 56
- 238000007906 compression Methods 0.000 title description 24
- 230000006835 compression Effects 0.000 title description 22
- 238000000034 method Methods 0.000 claims description 178
- 238000013528 artificial neural network Methods 0.000 claims description 96
- 238000013139 quantization Methods 0.000 claims description 90
- 230000001419 dependent effect Effects 0.000 claims description 50
- 238000012545 processing Methods 0.000 claims description 30
- 230000004044 response Effects 0.000 claims description 7
- 230000008569 process Effects 0.000 description 71
- 238000010606 normalization Methods 0.000 description 18
- 238000012549 training Methods 0.000 description 14
- 210000002569 neuron Anatomy 0.000 description 13
- 230000009467 reduction Effects 0.000 description 12
- 239000013598 vector Substances 0.000 description 9
- 239000011159 matrix material Substances 0.000 description 8
- 238000004364 calculation method Methods 0.000 description 7
- 238000000638 solvent extraction Methods 0.000 description 5
- 230000005540 biological transmission Effects 0.000 description 4
- 230000006837 decompression Effects 0.000 description 4
- 238000007667 floating Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 238000013507 mapping Methods 0.000 description 4
- 230000007246 mechanism Effects 0.000 description 4
- 230000001537 neural effect Effects 0.000 description 4
- 230000002093 peripheral effect Effects 0.000 description 4
- 230000008901 benefit Effects 0.000 description 3
- 238000004891 communication Methods 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 238000009966 trimming Methods 0.000 description 3
- 230000000007 visual effect Effects 0.000 description 3
- 230000006978 adaptation Effects 0.000 description 2
- 210000004556 brain Anatomy 0.000 description 2
- 230000001413 cellular effect Effects 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000011664 signaling Effects 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 101710092887 Integrator complex subunit 4 Proteins 0.000 description 1
- 102100030148 Integrator complex subunit 8 Human genes 0.000 description 1
- 101710092891 Integrator complex subunit 8 Proteins 0.000 description 1
- 102100037075 Proto-oncogene Wnt-3 Human genes 0.000 description 1
- 230000001133 acceleration Effects 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 230000001174 ascending effect Effects 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- 230000002457 bidirectional effect Effects 0.000 description 1
- 238000004422 calculation algorithm Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000007405 data analysis Methods 0.000 description 1
- 238000000354 decomposition reaction Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 238000013467 fragmentation Methods 0.000 description 1
- 238000006062 fragmentation reaction Methods 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- 238000012804 iterative process Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 238000010079 rubber tapping Methods 0.000 description 1
- 239000000779 smoke Substances 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 210000000225 synapse Anatomy 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- 230000004580 weight loss Effects 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/60—General implementation details not specific to a particular type of compression
- H03M7/6005—Decoder aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0495—Quantised networks; Sparse networks; Compressed networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3059—Digital compression and data reduction techniques where the original information is represented by a subset or similar information, e.g. lossy compression
- H03M7/3064—Segmenting
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3082—Vector coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/184—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/96—Tree coding, e.g. quad-tree coding
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Description
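where k denotes the associated parameter level (the transmitted quantization index).
where sgn(·) denotes the sign function: sgn(x) = 1 for x > 0, sgn(x) = 0 for x = 0, and sgn(x) = −1 for x < 0.
where sttab denotes Table 1.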
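The roles of k, sgn(·), and sttab in reconstruction can be illustrated with a minimal Python sketch of a dependent dequantizer. This is an illustration under stated assumptions, not the patent's exact procedure: Table 1 is not reproduced in this text, so the 4-state transition table `STTAB` below is a hypothetical one in the style of the dependent quantization used in VVC, and the names `dequantize_dependent` and `delta` (the quantization step size) are likewise assumptions.

```python
# Hypothetical 4-state transition table in the style of VVC dependent
# quantization. The patent's Table 1 (sttab) is NOT reproduced in this text,
# so these entries are an assumption made for illustration only.
STTAB = ((0, 2), (2, 0), (1, 3), (3, 1))


def sgn(k):
    """Sign function: 1 for k > 0, 0 for k == 0, -1 for k < 0."""
    return (k > 0) - (k < 0)


def dequantize_dependent(levels, delta):
    """Reconstruct parameters from parameter levels k.

    States 0 and 1 use a quantizer whose reconstruction points are even
    multiples of delta; states 2 and 3 use a quantizer whose nonzero points
    are odd multiples of delta. The next state depends on the parity of k.
    """
    state = 0
    reconstructed = []
    for k in levels:
        if state < 2:
            multiplier = 2 * k
        else:
            multiplier = 2 * k - sgn(k)
        reconstructed.append(multiplier * delta)
        state = STTAB[state][k & 1]  # parity of k selects the transition
    return reconstructed


print(dequantize_dependent([1, -1, 0, 2], delta=0.5))
```

Because the quantizer applied to each level depends on the parity of the levels decoded before it, the decoder has to process the levels in the same order the encoder used.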
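where s denotes the scale factor in LSA, W denotes the weight tensor, X denotes the source data, b denotes the bias, and γ, σ, μ, and β are the batchnorm parameters,
α denotes the resulting scale factor, and
δ denotes the resulting bias. Therefore, in this case, γ, σ, μ, and β can be used so that α and δ are signaled instead of s.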
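The relationship between s and the batchnorm parameters can be sketched as follows. The sketch assumes the conventional batch normalization form BN(y) = γ·(y − μ)/σ + β applied to y = s·(W∗X) + b, with σ taken as the standard deviation and any stabilizing epsilon ignored. Under that assumption the whole expression collapses to α·(W∗X) + δ with α = s·γ/σ and δ = (b − μ)·γ/σ + β. The function name `fold_scale_into_batchnorm` and the scalar, per-channel treatment are hypothetical; the patent's own equation is not reproduced in this text.

```python
import math


def fold_scale_into_batchnorm(s, b, gamma, sigma, mu, beta):
    """Fold the scale factor s and bias b into batchnorm parameters.

    Assumes BN(y) = gamma * (y - mu) / sigma + beta with y = s * wx + b,
    where wx stands for one output value of W * X. This form is an
    assumption made for illustration.
    Returns (alpha, delta) such that BN(y) == alpha * wx + delta.
    """
    alpha = s * gamma / sigma
    delta = (b - mu) * gamma / sigma + beta
    return alpha, delta


# Check the folded form against the unfolded computation for one value.
s, b, gamma, sigma, mu, beta = 2.0, 1.0, 0.5, 4.0, 1.0, 0.25
wx = 3.0
alpha, delta = fold_scale_into_batchnorm(s, b, gamma, sigma, mu, beta)
unfolded = gamma * ((s * wx + b) - mu) / sigma + beta
assert math.isclose(alpha * wx + delta, unfolded)
```

Signaling α and δ in place of the separate quantities means the decoder applies a single multiply-add per output channel.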
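where λR ≥ 0 is a hyperparameter that balances the contributions of the data loss and the regularization loss.
where λU ≥ 0 is a hyperparameter that balances the contributions of the original training target and weight unification. Jointly optimizing £(D|Θ) in Equation 11 yields an optimal set of weight coefficients, which contributes greatly to the effectiveness of further compression. In addition, the weight unification loss in Equation 11 takes into account the underlying process in which a convolution operation is executed as a general matrix multiplication (GEMM), and therefore produces optimized weight coefficients that can greatly accelerate computation. Note that the weight unification loss can be regarded as an additional regularization term on a general target loss, either with general regularization (λR > 0) or without it (λR = 0). The method can also be applied flexibly to any regularization loss £R(Θ).
where LU(Wj) is the unification loss defined for the j-th layer, N is the total number of layers over which the quantization loss is measured, and Wj denotes the weight coefficients of the j-th layer. Because LU(Wj) is computed independently for each layer, the subscript j is omitted in the remainder of this disclosure without loss of generality.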
whereby the output B is computed.
is generated. Based on the ground-truth annotation y and the estimated output, the target loss computation process can compute the target training loss £T(D|Θ) in Equation 11.
Appendix: Acronyms
DNN: deep neural network
NNR: coded representation of neural networks
CTU: coding tree unit
CTU3D: 3-dimensional coding tree unit
CU: coding unit
CU3D: 3-dimensional coding unit
RD: rate-distortion
VVC: versatile video coding
Claims (5)
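- A method of decoding a neural network at a decoder, the method comprising:
receiving, from a bitstream of a compressed representation of the neural network, a dependent quantization enabled flag that indicates whether a dependent quantization method is applied to model parameters of the neural network; and
in response to the dependent quantization enabled flag indicating that the model parameters of the neural network are encoded using the dependent quantization method, reconstructing the model parameters of the neural network based on the dependent quantization method.
- The method of claim 1, wherein the dependent quantization enabled flag is signaled at a model level, a layer level, a sub-layer level, a 3-dimensional coding unit (CU3D) level, or a 3-dimensional coding tree unit (CTU3D) level.
- The method of claim 1 or 2, further comprising: in response to the dependent quantization enabled flag indicating that the model parameters of the neural network are encoded using a uniform quantization method, reconstructing the model parameters of the neural network based on the uniform quantization method.
- A decoder comprising a memory and processing circuitry,
wherein the processing circuitry performs the method of any one of claims 1 to 3 by executing a program stored in the memory.
- A program for causing a processor to perform the method of any one of claims 1 to 3.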
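The flag-controlled choice between the two reconstruction paths in the claims can be sketched as below. This is a hedged illustration, not the claimed implementation: the function name `reconstruct_parameters`, the step size `delta`, the boolean `dq_enabled` standing in for the parsed flag, and the 4-state table `STTAB` (in the style of VVC dependent quantization, since Table 1 is not reproduced in this text) are all assumptions.

```python
# Hypothetical state transition table; the patent's Table 1 is not shown
# in this text, so a VVC-style 4-state table is assumed for illustration.
STTAB = ((0, 2), (2, 0), (1, 3), (3, 1))


def reconstruct_parameters(levels, delta, dq_enabled):
    """Reconstruct model parameters from decoded levels.

    dq_enabled stands in for the dependent quantization enabled flag read
    from the bitstream. When it is False, uniform reconstruction is used.
    """
    if not dq_enabled:
        # Uniform quantization: every level maps to level * step size.
        return [k * delta for k in levels]

    # Dependent quantization: the quantizer used for each level depends on
    # a state driven by the parity of the previously decoded levels.
    state = 0
    out = []
    for k in levels:
        sign = (k > 0) - (k < 0)
        multiplier = 2 * k if state < 2 else 2 * k - sign
        out.append(multiplier * delta)
        state = STTAB[state][k & 1]
    return out


levels = [1, -1, 0, 2]
print(reconstruct_parameters(levels, 0.5, dq_enabled=False))
print(reconstruct_parameters(levels, 0.5, dq_enabled=True))
```

The same levels reconstruct to different values depending on the flag, which is why the flag has to be available to the decoder before the affected parameters are reconstructed.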
Applications Claiming Priority (11)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202063011122P | 2020-04-16 | 2020-04-16 | |
US63/011,122 | 2020-04-16 | | |
US202063011908P | 2020-04-17 | 2020-04-17 | |
US63/011,908 | 2020-04-17 | | |
US202063042968P | 2020-06-23 | 2020-06-23 | |
US63/042,968 | 2020-06-23 | | |
US202063052368P | 2020-07-15 | 2020-07-15 | |
US63/052,368 | 2020-07-15 | | |
US17/225,486 | 2021-04-08 | | |
US17/225,486 US20210326710A1 (en) | 2020-04-16 | 2021-04-08 | Neural network model compression |
PCT/US2021/026995 WO2021211522A1 (en) | 2020-04-16 | 2021-04-13 | Neural network model compression |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2023505647A (ja) | 2023-02-10 |
JP7408799B2 (ja) | 2024-01-05 |
Family
ID=78082687
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2022527688A Active JP7408799B2 (ja) | 2020-04-16 | 2021-04-13 | ニューラルネットワークモデルの圧縮 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20210326710A1 (ja) |
EP (1) | EP4011071A4 (ja) |
JP (1) | JP7408799B2 (ja) |
KR (1) | KR20220058628A (ja) |
CN (1) | CN114402596A (ja) |
WO (1) | WO2021211522A1 (ja) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11037330B2 (en) * | 2017-04-08 | 2021-06-15 | Intel Corporation | Low rank matrix compression |
US11948090B2 (en) * | 2020-03-06 | 2024-04-02 | Tencent America LLC | Method and apparatus for video coding |
US20220108488A1 (en) * | 2020-10-07 | 2022-04-07 | Qualcomm Incorporated | Angular mode and in-tree quantization in geometry point cloud compression |
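KR102572828B1 (ko) | 2022-02-10 | 2023-08-31 | Nota Inc. | Method for obtaining a neural network model and electronic device for performing the same |
CN114723033B (zh) * | 2022-06-10 | 2022-08-19 | Chengdu Denglin Technology Co., Ltd. | Data processing method and apparatus, AI chip, electronic device, and storage medium |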
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190094477A1 (en) | 2013-11-22 | 2019-03-28 | Sony Corporation | Optical communication device, reception apparatus, transmission apparatus, and transmission and reception system |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102124714B1 (ko) * | 2015-09-03 | 2020-06-19 | MediaTek Inc. | Method and apparatus of neural network based processing in video coding |
US11107461B2 (en) * | 2016-06-01 | 2021-08-31 | Massachusetts Institute Of Technology | Low-power automatic speech recognition device |
US10643124B2 (en) * | 2016-08-12 | 2020-05-05 | Beijing Deephi Intelligent Technology Co., Ltd. | Method and device for quantizing complex artificial neural network |
US11593632B2 (en) * | 2016-12-15 | 2023-02-28 | WaveOne Inc. | Deep learning based on image encoding and decoding |
JP2021519546A (ja) * | 2018-03-29 | 2021-08-10 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Determination of candidate transform sets for video coding |
-
2021
- 2021-04-08 US US17/225,486 patent/US20210326710A1/en active Pending
- 2021-04-13 WO PCT/US2021/026995 patent/WO2021211522A1/en unknown
- 2021-04-13 CN CN202180005390.XA patent/CN114402596A/zh active Pending
- 2021-04-13 JP JP2022527688A patent/JP7408799B2/ja active Active
- 2021-04-13 KR KR1020227011926A patent/KR20220058628A/ko active Search and Examination
- 2021-04-13 EP EP21788018.6A patent/EP4011071A4/en active Pending
Also Published As
Publication number | Publication date |
---|---|
WO2021211522A1 (en) | 2021-10-21 |
EP4011071A4 (en) | 2023-04-26 |
JP2023505647A (ja) | 2023-02-10 |
KR20220058628A (ko) | 2022-05-09 |
EP4011071A1 (en) | 2022-06-15 |
CN114402596A (zh) | 2022-04-26 |
US20210326710A1 (en) | 2021-10-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
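JP7408799B2 (ja) | Neural network model compression | |
CN112437930A (zh) | Generating a compressed representation of a neural network with proficient inference speed and power consumption | |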
US11671110B2 (en) | Method and apparatus for neural network model compression/decompression | |
US11948090B2 (en) | Method and apparatus for video coding | |
US20230026190A1 (en) | Signaling of coding tree unit block partitioning in neural network model compression | |
WO2024011426A1 (zh) | Point cloud geometric data enhancement method, encoding and decoding method, apparatus, and system | |
US20230306239A1 (en) | Online training-based encoder tuning in neural image compression | |
US20230316588A1 (en) | Online training-based encoder tuning with multi model selection in neural image compression | |
US20230316048A1 (en) | Multi-rate computer vision task neural networks in compression domain | |
US20230336738A1 (en) | Multi-rate of computer vision task neural networks in compression domain | |
US20230334718A1 (en) | Online training computer vision task models in compression domain |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20220614 |
 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20230822 |
 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20231121 |
 | TRDD | Decision of grant or rejection written | |
 | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20231128 |
 | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20231220 |
 | R150 | Certificate of patent or registration of utility model | Ref document number: 7408799; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |