JPWO2022201352A5 - Inference device, inference method, and inference program - Google Patents


Info

Publication number
JPWO2022201352A5
Authority
JP
Japan
Prior art keywords
quantization
feature data
inference
quantized
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2023508251A
Other languages
Japanese (ja)
Other versions
JP7350214B2 (en)
JPWO2022201352A1 (en)
Filing date
2021-03-24
Publication date
2023-06-28
Application filed Critical
Priority claimed from PCT/JP2021/012193 external-priority patent/WO2022201352A1/en
Publication of JPWO2022201352A1 publication Critical patent/JPWO2022201352A1/ja
Publication of JPWO2022201352A5 publication Critical patent/JPWO2022201352A5/ja
Application granted granted Critical
Publication of JP7350214B2 publication Critical patent/JP7350214B2/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Description

An inference device according to the present disclosure comprises:
a quantized inference unit that executes, using inference data, at least one quantization operation based on a machine learning technique; and
a non-quantized inference unit that executes, using the inference data, at least one of at least one non-quantization operation corresponding respectively to the at least one quantization operation,
wherein each of the at least one quantization operation is an operation according to a respective piece of quantization feature data indicating a feature of that quantization operation,
each of the at least one non-quantization operation is an operation according to a respective piece of non-quantization feature data indicating a feature of that non-quantization operation, and
the inference device further comprises
a data extraction unit that, when an overflow occurs in at least one of the at least one quantization operation, extracts the quantization feature data corresponding to each quantization operation in which the overflow occurred and the non-quantization feature data corresponding to the non-quantization operation corresponding to each quantization operation in which the overflow occurred.
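The structure described above can be illustrated with a minimal Python sketch. All names, the per-operation scale parameter, and the identity floating-point operation are illustrative assumptions; the disclosure does not prescribe a particular implementation.

```python
import numpy as np


def quantization_op(x, scale, bits=8):
    """One quantization operation: quantize x with the given scale and report overflow.
    (Hypothetical formulation; the disclosure does not fix a quantization scheme.)"""
    limit = 2 ** (bits - 1) - 1
    q = np.round(x / scale)
    overflowed = bool(np.any(q > limit) or np.any(q < -limit - 1))
    return np.clip(q, -limit - 1, limit).astype(np.int32), overflowed


def non_quantization_op(x):
    """The corresponding non-quantization (floating-point) operation: here simply the identity."""
    return np.asarray(x, dtype=np.float32)


def extract_feature_data(x, ops):
    """Data extraction unit: for every quantization operation in which an overflow occurred,
    collect its quantization feature data and the non-quantization feature data of the
    corresponding floating-point operation."""
    extracted = []
    for name, scale in ops:
        _, overflowed = quantization_op(x, scale)     # quantized inference unit
        y_float = non_quantization_op(x)              # non-quantized inference unit
        if overflowed:
            extracted.append({
                "operation": name,
                "quantization_feature": {"scale": scale},
                "non_quantization_feature": {"min": float(y_float.min()),
                                             "max": float(y_float.max())},
            })
    return extracted


# Usage: the second operation's scale cannot represent the value 40.0, so it overflows
# and its feature data (together with the observed float range) is extracted.
inference_data = np.array([0.5, 3.0, -2.5, 40.0])
print(extract_feature_data(inference_data, ops=[("op1", 0.5), ("op2", 0.05)]))
```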

According to the present disclosure, when an overflow occurs during inference based on a machine learning technique, the data extraction unit extracts the quantization feature data and the non-quantization feature data related to the overflow that occurred. Each of the quantization feature data and the non-quantization feature data is data different from the inference data. Therefore, according to the present disclosure, when an overflow occurs while inference is executed using certain inference data, data for analyzing that overflow can be obtained that is different from the inference data itself.

When an overflow occurs in the quantized inference process, the data extraction unit 130 extracts saved data from the quantized inference process and the non-quantized inference process. The saved data is data used to analyze the overflow in the quantized inference process and, as a specific example, consists of input data and feature data. The input data is the data that is input when each of the at least one quantization operation and each of the at least one non-quantization operation is executed. The feature data is data used in the quantization operations. Feature data is a collective term for quantization feature data and non-quantization feature data; it exists for each of the at least one quantization operation and each of the at least one non-quantization operation and, as a specific example, represents the range of the input data for the operation. The range of the input data is the range from the minimum value to the maximum value of the values that data assumed as the input data can take. When an overflow occurs in at least one of the at least one quantization operation, the data extraction unit 130 extracts the quantization feature data corresponding to each quantization operation in which the overflow occurred and the non-quantization feature data corresponding to the non-quantization operation corresponding to each quantization operation in which the overflow occurred.
When the machine learning technique is deep learning, the data extraction unit 130 may extract, as the quantization feature data, data indicating parameters of the layer corresponding to the quantization operation in which the overflow occurred, and may extract, as the non-quantization feature data, data indicating parameters of the layer corresponding to the non-quantization operation corresponding to the quantization operation in which the overflow occurred.
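As a concrete illustration of the saved data, the following hypothetical helper collects, for one layer whose quantization operation overflowed, the input data together with the quantization feature data (the value range the quantization parameters can represent) and the non-quantization feature data (the value range actually observed in the floating-point pass). Function and field names are assumptions for illustration, not an API defined by the disclosure.

```python
import numpy as np


def collect_saved_data(layer_name, layer_input, quant_scale, bits=8):
    """Build the saved data for one layer in which the quantization operation overflowed:
    the input data plus the quantization and non-quantization feature data."""
    limit = 2 ** (bits - 1) - 1
    representable = (-quant_scale * (limit + 1), quant_scale * limit)    # range the quantizer can express
    observed = (float(np.min(layer_input)), float(np.max(layer_input)))  # range seen in the float pass
    return {
        "layer": layer_name,                    # with deep learning, feature data are per-layer parameters
        "input_data": np.asarray(layer_input),  # input to both the quantized and the float operation
        "quantization_feature": {"scale": quant_scale, "range": representable},
        "non_quantization_feature": {"range": observed},
    }


# Example: the observed range (-3.2, 9.7) exceeds the representable range (-1.28, 1.27),
# which is exactly the situation the extracted data is meant to make analyzable.
saved = collect_saved_data("conv1", [0.1, -3.2, 9.7], quant_scale=0.01)
print(saved["quantization_feature"]["range"], saved["non_quantization_feature"]["range"])
```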

Claims (6)

1. An inference device comprising:
a quantized inference unit that executes, using inference data, at least one quantization operation based on a machine learning technique; and
a non-quantized inference unit that executes, using the inference data, at least one of at least one non-quantization operation corresponding respectively to the at least one quantization operation,
wherein each of the at least one quantization operation is an operation according to a respective piece of quantization feature data indicating a feature of that quantization operation,
each of the at least one non-quantization operation is an operation according to a respective piece of non-quantization feature data indicating a feature of that non-quantization operation, and
the inference device further comprises
a data extraction unit that, when an overflow occurs in at least one of the at least one quantization operation, extracts the quantization feature data corresponding to each quantization operation in which the overflow occurred and the non-quantization feature data corresponding to the non-quantization operation corresponding to each quantization operation in which the overflow occurred.
2. The inference device according to claim 1, wherein
each piece of the at least one quantization feature data includes a parameter corresponding to the respective quantization operation, and
each piece of the at least one non-quantization feature data includes a parameter corresponding to the respective non-quantization operation.
3. The inference device according to claim 1 or 2, wherein the machine learning technique is deep learning, and the data extraction unit
extracts, as the quantization feature data, data indicating a parameter of the layer corresponding to the quantization operation in which the overflow occurred, and
extracts, as the non-quantization feature data, data indicating a parameter of the layer corresponding to the non-quantization operation corresponding to the quantization operation in which the overflow occurred.
4. The inference device according to any one of claims 1 to 3, further comprising
a requantization unit that, based on the extracted quantization feature data and non-quantization feature data, modifies the quantization feature data corresponding to a target operation, which is the quantization operation corresponding to the extracted quantization feature data, so that no overflow occurs in the target operation,
wherein the quantized inference unit executes a quantization operation according to the modified quantization feature data.
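A minimal sketch of what such a requantization unit could do, assuming symmetric, scale-only quantization (the claim does not prescribe any particular formula): recompute the scale of the target operation from the range recorded in the extracted non-quantization feature data so that the quantized values fit within the signed integer range.

```python
def requantize_scale(observed_min: float, observed_max: float, bits: int = 8) -> float:
    """Return a modified quantization scale for the target operation so that the range
    recorded in the non-quantization feature data no longer overflows the integer range."""
    limit = 2 ** (bits - 1) - 1
    max_abs = max(abs(observed_min), abs(observed_max))
    return max_abs / limit if max_abs > 0 else 1.0


# Example: the floating-point pass observed values in [-3.2, 9.7]; with the modified scale,
# 9.7 maps to the maximum representable value 127, so the quantization operation no longer overflows.
new_scale = requantize_scale(-3.2, 9.7)
print(new_scale)  # approximately 0.0764
```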
5. An inference method comprising:
executing, by a computer, at least one quantization operation based on a machine learning technique using inference data; and
executing, by the computer, at least one of at least one non-quantization operation corresponding respectively to the at least one quantization operation using the inference data,
wherein each of the at least one quantization operation is an operation according to a respective piece of quantization feature data indicating a feature of that quantization operation,
each of the at least one non-quantization operation is an operation according to a respective piece of non-quantization feature data indicating a feature of that non-quantization operation, and
the computer, when an overflow occurs in at least one of the at least one quantization operation, extracts the quantization feature data corresponding to each quantization operation in which the overflow occurred and the non-quantization feature data corresponding to the non-quantization operation corresponding to each quantization operation in which the overflow occurred.
6. An inference program that causes an inference device, which is a computer, to execute:
quantized inference processing that executes, using inference data, at least one quantization operation based on a machine learning technique; and
non-quantized inference processing that executes, using the inference data, at least one of at least one non-quantization operation corresponding respectively to the at least one quantization operation,
wherein each of the at least one quantization operation is an operation according to a respective piece of quantization feature data indicating a feature of that quantization operation,
each of the at least one non-quantization operation is an operation according to a respective piece of non-quantization feature data indicating a feature of that non-quantization operation, and
the inference program further causes the inference device to execute
data extraction processing that, when an overflow occurs in at least one of the at least one quantization operation, extracts the quantization feature data corresponding to each quantization operation in which the overflow occurred and the non-quantization feature data corresponding to the non-quantization operation corresponding to each quantization operation in which the overflow occurred.
JP2023508251A 2021-03-24 2021-03-24 Inference device, inference method, and inference program Active JP7350214B2 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/012193 WO2022201352A1 (en) 2021-03-24 2021-03-24 Inference device, inference method, and inference program

Publications (3)

Publication Number Publication Date
JPWO2022201352A1 JPWO2022201352A1 (en) 2022-09-29
JPWO2022201352A5 JPWO2022201352A5 (en) 2023-06-28
JP7350214B2 JP7350214B2 (en) 2023-09-25

Family

ID=83396538

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2023508251A Active JP7350214B2 (en) 2021-03-24 2021-03-24 Inference device, inference method, and inference program

Country Status (3)

Country Link
JP (1) JP7350214B2 (en)
TW (1) TW202238458A (en)
WO (1) WO2022201352A1 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10552119B2 (en) 2016-04-29 2020-02-04 Intel Corporation Dynamic management of numerical representation in a distributed matrix processor architecture
JP7045947B2 (en) * 2018-07-05 2022-04-01 株式会社日立製作所 Neural network learning device and learning method
KR20200043169A (en) * 2018-10-17 2020-04-27 삼성전자주식회사 Method and apparatus for quantizing neural network parameters
