JPWO2022201352A5 - Inference device, inference method, and inference program - Google Patents
- Publication number
- JPWO2022201352A5 (application JP2023508251A)
- Authority
- JP
- Japan
- Prior art keywords
- quantization
- feature data
- inference
- quantized
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Description
The inference device according to the present disclosure comprises:
a quantization inference unit that performs, using inference data, at least one quantization operation based on a machine learning technique; and
a non-quantization inference unit that performs, using the inference data, at least one of at least one non-quantization operation respectively corresponding to the at least one quantization operation,
wherein each of the at least one quantization operation is an operation according to a respective piece of quantization feature data indicating a feature of that quantization operation,
each of the at least one non-quantization operation is an operation according to a respective piece of non-quantization feature data indicating a feature of that non-quantization operation, and
the inference device further comprises
a data extraction unit that, when an overflow occurs in at least one of the at least one quantization operation, extracts the quantization feature data corresponding to each quantization operation in which the overflow occurred and the non-quantization feature data corresponding to the non-quantization operation that corresponds to each such quantization operation.
According to the present disclosure, when an overflow occurs during inference based on a machine learning technique, the data extraction unit extracts the quantization feature data and the non-quantization feature data related to that overflow. Both the quantization feature data and the non-quantization feature data are distinct from the inference data. Consequently, when an overflow occurs while inference is executed on certain inference data, data for analyzing the overflow that is distinct from that inference data can be obtained.
The data extraction unit 130 extracts saved data from the quantization inference process and the non-quantization inference process when an overflow occurs in the quantization inference process. The saved data is used to analyze the overflow in the quantization inference process and, as a specific example, consists of input data and feature data. The input data is the data that is input when each of the at least one quantization operation and each of the at least one non-quantization operation is executed. The feature data is data utilized in the quantization operation. "Feature data" is a collective term for quantization feature data and non-quantization feature data; a piece of feature data exists for each of the at least one quantization operation and each of the at least one non-quantization operation and, as a specific example, represents the range of the input data in that operation. The range of the input data is the span from the minimum value to the maximum value that data expected as input data can take. When an overflow occurs in at least one of the at least one quantization operation, the data extraction unit 130 extracts the quantization feature data corresponding to each quantization operation in which the overflow occurred and the non-quantization feature data corresponding to the non-quantization operation that corresponds to each such quantization operation.
When the machine learning technique is deep learning, the data extraction unit 130 may extract, as the quantization feature data, data indicating the parameters of the layer corresponding to the quantization operation in which the overflow occurred, and may extract, as the non-quantization feature data, data indicating the parameters of the layer corresponding to the non-quantization operation that corresponds to that quantization operation.
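The behavior of the data extraction unit 130 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the disclosure: the `FeatureData` class (holding only the minimum and maximum of the expected input range), the `quantize` and `extract_on_overflow` functions, the signed 8-bit code range, and the simplification that each operation receives its own scalar input. An overflow is modeled as an input whose integer code falls outside the representable range.

```python
from dataclasses import dataclass

# Assumed code range for the quantized representation (signed 8-bit).
INT8_MIN, INT8_MAX = -128, 127


@dataclass
class FeatureData:
    """Hypothetical feature data: the expected input range of one operation."""
    lo: float
    hi: float


def quantize(x, feat):
    """Map x to an integer code using the range in feat.

    Returns the code clamped to the int8 range and a flag that is True
    when the unclamped code fell outside that range (the overflow case).
    """
    scale = (feat.hi - feat.lo) / (INT8_MAX - INT8_MIN)
    q = round((x - feat.lo) / scale) + INT8_MIN
    overflow = q < INT8_MIN or q > INT8_MAX
    return max(INT8_MIN, min(INT8_MAX, q)), overflow


def extract_on_overflow(inputs, q_feats, nq_feats):
    """Collect saved data for every operation whose quantized input overflowed.

    Operation i of the quantized path is paired with operation i of the
    non-quantized path, so both pieces of feature data are saved together.
    """
    saved = []
    for i, (x, qf, nf) in enumerate(zip(inputs, q_feats, nq_feats)):
        _, overflow = quantize(x, qf)
        if overflow:
            saved.append({
                "op": i,
                "input": x,
                "quantization_feature": qf,
                "non_quantization_feature": nf,
            })
    return saved
```

Called with inputs `[0.5, 3.0]` and a quantization range of `[-1.0, 1.0]` for both operations, only the second operation is reported, together with its paired non-quantization feature data. The saved record contains feature data rather than the full inference data set, which mirrors the stated aim of obtaining analysis data distinct from the inference data.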
Claims (6)
1. An inference device comprising:
a quantization inference unit that performs, using inference data, at least one quantization operation based on a machine learning technique; and
a non-quantization inference unit that performs, using the inference data, at least one of at least one non-quantization operation respectively corresponding to the at least one quantization operation,
wherein each of the at least one quantization operation is an operation according to a respective piece of quantization feature data indicating a feature of that quantization operation,
each of the at least one non-quantization operation is an operation according to a respective piece of non-quantization feature data indicating a feature of that non-quantization operation, and
the inference device further comprises
a data extraction unit that, when an overflow occurs in at least one of the at least one quantization operation, extracts the quantization feature data corresponding to each quantization operation in which the overflow occurred and the non-quantization feature data corresponding to the non-quantization operation that corresponds to each such quantization operation.
2. The inference device according to claim 1, wherein
each piece of the quantization feature data includes a parameter corresponding to the respective quantization operation, and
each piece of the non-quantization feature data includes a parameter corresponding to the respective non-quantization operation.
3. The inference device according to claim 1 or 2, wherein
the machine learning technique is deep learning, and
the data extraction unit
extracts, as the quantization feature data, data indicating parameters of the layer corresponding to the quantization operation in which the overflow occurred, and
extracts, as the non-quantization feature data, data indicating parameters of the layer corresponding to the non-quantization operation that corresponds to the quantization operation in which the overflow occurred.
4. The inference device according to any one of claims 1 to 3, further comprising
a re-quantization unit that, based on the extracted quantization feature data and non-quantization feature data, changes the quantization feature data corresponding to a target operation, the target operation being the quantization operation corresponding to the extracted quantization feature data, so that no overflow occurs in the target operation,
wherein the quantization inference unit performs a quantization operation according to the changed quantization feature data.
5. An inference method in which
a computer performs, using inference data, at least one quantization operation based on a machine learning technique, and
the computer performs, using the inference data, at least one of at least one non-quantization operation respectively corresponding to the at least one quantization operation,
wherein each of the at least one quantization operation is an operation according to a respective piece of quantization feature data indicating a feature of that quantization operation,
each of the at least one non-quantization operation is an operation according to a respective piece of non-quantization feature data indicating a feature of that non-quantization operation, and
the computer, when an overflow occurs in at least one of the at least one quantization operation, extracts the quantization feature data corresponding to each quantization operation in which the overflow occurred and the non-quantization feature data corresponding to the non-quantization operation that corresponds to each such quantization operation.
6. An inference program that causes an inference device, which is a computer, to execute:
quantization inference processing that performs, using inference data, at least one quantization operation based on a machine learning technique; and
non-quantization inference processing that performs, using the inference data, at least one of at least one non-quantization operation respectively corresponding to the at least one quantization operation,
wherein each of the at least one quantization operation is an operation according to a respective piece of quantization feature data indicating a feature of that quantization operation,
each of the at least one non-quantization operation is an operation according to a respective piece of non-quantization feature data indicating a feature of that non-quantization operation, and
the inference program further causes the inference device to execute
data extraction processing that, when an overflow occurs in at least one of the at least one quantization operation, extracts the quantization feature data corresponding to each quantization operation in which the overflow occurred and the non-quantization feature data corresponding to the non-quantization operation that corresponds to each such quantization operation.
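The re-quantization unit of claim 4 changes the quantization feature data of the target operation so that the overflow does not recur. The claims do not state how the new feature data is chosen, so the following Python sketch assumes one simple policy purely for illustration: the quantization range is widened until it also covers the range recorded in the non-quantization feature data. The names `FeatureData`, `overflows`, and `requantize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureData:
    """Hypothetical feature data: the expected input range of one operation."""
    lo: float
    hi: float


def overflows(x, feat):
    """True when x lies outside the range that feat can represent."""
    return x < feat.lo or x > feat.hi


def requantize(q_feat, nq_feat):
    """Return changed quantization feature data for the target operation.

    Assumed policy: take the union of the quantization range and the
    non-quantization range, so any value seen by the non-quantized
    operation fits in the new quantization range.
    """
    return FeatureData(min(q_feat.lo, nq_feat.lo), max(q_feat.hi, nq_feat.hi))
```

Under this policy, an input of `3.0` that overflows a quantization range of `[-1.0, 1.0]` no longer overflows once the range is widened to a non-quantization range of `[-4.0, 4.0]`. Widening the range lowers resolution for a fixed bit width, so a real design might trade the two off differently.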
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/012193 WO2022201352A1 (en) | 2021-03-24 | 2021-03-24 | Inference device, inference method, and inference program |
Publications (3)
Publication Number | Publication Date |
---|---|
JPWO2022201352A1 (en) | 2022-09-29 |
JPWO2022201352A5 (en) | 2023-06-28 |
JP7350214B2 (en) | 2023-09-25 |
Family
ID=83396538
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2023508251A Active JP7350214B2 (en) | 2021-03-24 | 2021-03-24 | Inference device, inference method, and inference program |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP7350214B2 (en) |
TW (1) | TW202238458A (en) |
WO (1) | WO2022201352A1 (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10552119B2 (en) | 2016-04-29 | 2020-02-04 | Intel Corporation | Dynamic management of numerical representation in a distributed matrix processor architecture |
JP7045947B2 (en) * | 2018-07-05 | 2022-04-01 | Hitachi, Ltd. | Neural network learning device and learning method |
KR20200043169A (en) * | 2018-10-17 | 2020-04-27 | Samsung Electronics Co., Ltd. | Method and apparatus for quantizing neural network parameters |
2021
- 2021-03-24 WO PCT/JP2021/012193 (WO2022201352A1): Application Filing
- 2021-03-24 JP JP2023508251A (JP7350214B2): Active
- 2021-08-18 TW TW110130424A (TW202238458A): status unknown
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP2020534456A5 (en) | ||
RU2017119853A (en) | COMPLEXITY OF LOCALIZATION OF ARBITRARY LANGUAGE MATERIALS AND RESOURCES | |
JP2019511033A5 (en) | ||
JP2006246435A5 (en) | ||
CN111078546B (en) | Page feature expression method and electronic equipment | |
CN112100386B (en) | Method for determining target type app, electronic device and medium | |
JP2023052161A5 (en) | MODEL CREATION DEVICE, MODEL CREATION METHOD AND COMPUTER PROGRAM | |
CN103678598A (en) | Circuit board accurate detecting method for built-in standard establishment based on Gerber file | |
WO2020231007A3 (en) | Medical equipment learning system | |
CN106875345A (en) | Non-local TV model image denoising method based on singular value weight function | |
CN113742205B (en) | Code vulnerability intelligent detection method based on man-machine cooperation | |
JP2022160590A (en) | Method and device for determining pre-trained model, electronic device, and storage medium | |
JP2023025126A (en) | Training method and apparatus for deep learning model, text data processing method and apparatus, electronic device, storage medium, and computer program | |
JP2020119154A5 (en) | ||
CN116361191A (en) | Software compatibility processing method based on artificial intelligence | |
JPWO2022201352A5 (en) | ||
CN114495101A (en) | Text detection method, and training method and device of text detection network | |
KR20200053170A (en) | Method for setting artificial intelligence execution model and system for acceleration a.i execution | |
CN117216564A (en) | Large model automatic fine tuning training method and equipment for small domain vertical expert | |
ZA202308288B (en) | Method and system for unsupervised deep representation learning based on image translation | |
JPWO2021024476A5 (en) | Software analysis equipment, software analysis methods and programs | |
CN109143851B (en) | Method for recognizing multi-mark fault deep learning and intelligently expressing result thereof | |
CN116340777A (en) | Training method of log classification model, log classification method and device | |
CN112380328B (en) | Interaction method and system for safety emergency response robot | |
JPWO2021181676A5 (en) | Information processing device, control method and program |