CN116048456A - Matrix multiplier, method of matrix multiplication, and computing device

Info

Publication number: CN116048456A
Application number: CN202310344718.1A
Authority: CN
Inventors: Name withheld on request
Original assignee: Moore Threads Technology Co Ltd
Current assignee: Moore Threads Technology Co Ltd
Priority date: 2023-04-03
Filing date: 2023-04-03
Publication date: 2023-05-02

Abstract

A matrix multiplier, a matrix multiplication method, and a computing device. The matrix multiplier comprises a comparison circuit and a first operation circuit. The comparison circuit is configured to determine whether first target data and/or second target data are data in a first set. The first operation circuit is configured to output, based on the first target data and/or the second target data being data in the first set, a first result of multiplying the first target data by the second target data. The first set includes 0 and ±2ⁿ, where n is an integer. The first result includes 0 or third data, where the third data is obtained by shifting the first target data or the second target data according to n, or by shifting and inverting the first target data or the second target data according to n. The matrix multiplier can save power.

Description

Matrix multiplier, method of matrix multiplication and computing device

Technical Field

The present application relates to the field of chip design, and more particularly to a matrix multiplier, a matrix multiplication method, and a computing device.

Background

Matrix multiplication (MM) is one of the most important mathematical operations in modern artificial-intelligence technologies such as neural networks and machine learning. As an example, matrix multiplication can be performed by a matrix multiplier.

In a related technical solution, because matrix multiplication involves many multiplication and addition operations, the matrix multiplier includes a conventional multiplication circuit, which needs to include multiple adders, shifters, and multiple multipliers. Because it includes multiple adders, shifters, and multiple multipliers, the conventional multiplication circuit consumes relatively high power when performing matrix multiplication.

Therefore, how to reduce the power consumption of a matrix multiplier has become a technical problem in urgent need of a solution.

Summary of the Invention

The present application provides a matrix multiplier, a matrix multiplication method, and a computing device. The matrix multiplier can save power.

In a first aspect, a matrix multiplier is provided. The matrix multiplier includes a comparison circuit and a first operation circuit.

The comparison circuit is configured to determine whether first target data and/or second target data are data in a first set, where the first target data is data in a first matrix, the second target data is data in a second matrix, and the first set includes 0 and 2ⁿ, where n is an integer.

The first operation circuit is configured to output, based on the first target data and/or the second target data being data in the first set, a first result of multiplying the first target data by the second target data. The first result includes 0 or third data, where the third data is obtained by shifting the first target data or the second target data according to n, or by shifting and inverting the first target data or the second target data according to n.

When multiplying data from two matrices, the above matrix multiplier can directly output the corresponding special result for special data (for example, data equal to 0 or ±2ⁿ) instead of performing a conventional multiplication. This reduces the higher power consumption incurred by conventional multiplication. Moreover, because the matrix multiplier multiplies data from two matrices and therefore performs a large number of operations, the benefit of the power saving is considerable.

With reference to the first aspect, in some implementations of the first aspect, the first operation circuit is specifically configured to output 0 as the first result based on the first target data being 0.

With reference to the first aspect, in some implementations of the first aspect, the first operation circuit is specifically configured to output the third data as the first result based on the first target data being 2ⁿ, where the third data is obtained by shifting the second target data left or right by |n| bits.

With reference to the first aspect, in some implementations of the first aspect, the first operation circuit is specifically configured to: output the third data as the first result based on the first target data being 2ⁿ and n being a positive integer, where the third data is obtained by shifting the second target data left by n bits; or output the third data as the first result based on the first target data being 2ⁿ and n being a negative integer, where the third data is obtained by shifting the second target data right by |n| bits.

With reference to the first aspect, in some implementations of the first aspect, the first operation circuit is specifically configured to output the third data as the first result based on the first target data being -2ⁿ, where the third data is obtained by shifting the second target data left or right by |n| bits and inverting the result.
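The shift rules in the preceding implementations can be sketched as follows. This is only an illustration under stated assumptions: the second target data is modeled as a Python integer, "inverting" is interpreted as arithmetic negation (which is what multiplying by -2ⁿ requires), and the function name `scale_by_power_of_two` is hypothetical.

```python
def scale_by_power_of_two(b: int, n: int, negate: bool = False) -> int:
    """Return b * 2**n (or b * -(2**n) when negate is True) using shifts only.

    n > 0 shifts left by n bits, n < 0 shifts right by |n| bits, and n == 0
    leaves b unchanged. A right shift discards low-order bits, so the result
    is exact only when b is a multiple of 2**|n| (or when b is read as a
    fixed-point value with enough fractional bits).
    """
    if n >= 0:
        shifted = b << n
    else:
        shifted = b >> (-n)
    return -shifted if negate else shifted
```

For example, `scale_by_power_of_two(5, 3)` corresponds to 5 × 2³, and `scale_by_power_of_two(12, -2, negate=True)` corresponds to 12 × (-2⁻²).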

With reference to the first aspect, in some implementations of the first aspect, the comparison circuit is further configured to determine a first operation code based on the first target data and/or the second target data being data in the first set, where the value of the first operation code indicates that the first target data and/or the second target data is 0 or ±2ⁿ; and the first operation circuit is specifically configured to determine, based on the value of the first operation code and the first target data and/or the second target data, that the first result is 0 or the third data.
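The application does not specify how the first operation code is encoded. Purely as an assumption for illustration, the sketch below uses a three-valued code plus the exponent n, with integer data and non-negative n; `OpCode`, `compare`, and `first_operation` are hypothetical names.

```python
from enum import IntEnum
from typing import Optional, Tuple


class OpCode(IntEnum):
    """Hypothetical encoding of the first operation code."""
    ZERO = 0      # the examined operand is 0
    POS_POW2 = 1  # the examined operand is +2**n
    NEG_POW2 = 2  # the examined operand is -2**n


def compare(a: int) -> Optional[Tuple[OpCode, int]]:
    """Model of the comparison circuit: return (opcode, n), or None if a is not in the first set."""
    if a == 0:
        return OpCode.ZERO, 0
    mag = abs(a)
    if mag & (mag - 1) == 0:  # exactly one bit set
        n = mag.bit_length() - 1
        return (OpCode.POS_POW2 if a > 0 else OpCode.NEG_POW2), n
    return None


def first_operation(opcode: OpCode, n: int, b: int) -> int:
    """Model of the first operation circuit: derive the first result from the opcode and the other operand."""
    if opcode == OpCode.ZERO:
        return 0
    shifted = b << n
    return shifted if opcode == OpCode.POS_POW2 else -shifted
```

Separating `compare` from `first_operation` mirrors the split between the two circuits: the operation circuit never inspects the special operand itself, only the code and n derived from it.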

With reference to the first aspect, in some implementations of the first aspect, the matrix multiplier further includes at least one first register connected to both the comparison circuit and the first operation circuit. The at least one first register is configured to obtain the first target data and the second target data from the comparison circuit, and is further configured to output the first target data and the second target data to the first operation circuit.

With reference to the first aspect, in some implementations of the first aspect, the at least one first register is further configured to obtain the first operation code from the comparison circuit and output the first operation code to the first operation circuit.

With reference to the first aspect, in some implementations of the first aspect, the matrix multiplier further includes at least one second register and a second operation circuit, the at least one second register being connected to the second operation circuit. The at least one second register is configured to output the obtained first target data and second target data to the second operation circuit when the comparison circuit determines that neither the first target data nor the second target data is data in the first set. The second operation circuit is configured to perform a conventional multiplication on the received first target data and second target data, and to output a second result of multiplying the first target data by the second target data.

With reference to the first aspect, in some implementations of the first aspect, the matrix multiplier further includes a data selector (MUX) connected to both the second operation circuit and the first operation circuit. The MUX is configured to provide the first result or the second result as the output of the matrix multiplier.

With reference to the first aspect, in some implementations of the first aspect, the comparison circuit is further configured to output an enable signal with a value of 1 to the at least one first register when the comparison circuit determines that the first target data and/or the second target data are data in the first set. The at least one first register is specifically configured to output the first target data and the second target data to the first operation circuit according to the enable signal with the value of 1.

With reference to the first aspect, in some implementations of the first aspect, the matrix multiplier further includes an inversion circuit connected to both the comparison circuit and the at least one second register. The comparison circuit is further configured to output an enable signal with a value of 0 to the at least one first register when the comparison circuit determines that neither the first target data nor the second target data is data in the first set. The inversion circuit is configured to invert the enable signal with the value of 0 output by the comparison circuit to obtain an enable signal with a value of 1, and to output the enable signal with the value of 1 to the at least one second register. The at least one second register is specifically configured to output the obtained first target data and second target data to the second operation circuit based on the enable signal with the value of 1.

With reference to the first aspect, in some implementations of the first aspect, the inversion circuit is a NOT gate.

With reference to the first aspect, in some implementations of the first aspect, the MUX is specifically configured to: provide the first result as the output of the matrix multiplier after receiving an enable signal with a value of 1; or provide the second result as the output of the matrix multiplier after receiving an enable signal with a value of 0.
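The routing described in the preceding implementations (enable signal, first and second registers, NOT gate, and MUX) can be modeled behaviorally as below. This is a sketch under assumptions rather than the claimed circuit: data is integer, n is non-negative, registers are modeled as plain conditionals, and all function names are hypothetical.

```python
from typing import Optional, Tuple


def _is_special(x: int) -> bool:
    """True when x is 0 or +/-2**n for some n >= 0."""
    return x == 0 or (abs(x) & (abs(x) - 1)) == 0


def _first_operation(special: int, other: int) -> int:
    """Shift-based result for the case where `special` is 0 or +/-2**n."""
    if special == 0:
        return 0
    shifted = other << (abs(special).bit_length() - 1)
    return shifted if special > 0 else -shifted


def multiply_element(a: int, b: int) -> Tuple[int, int]:
    """Return (enable, output) for one pair of target data."""
    # Comparison circuit: enable is 1 when at least one operand is in the first set.
    enable = 1 if (_is_special(a) or _is_special(b)) else 0
    # NOT gate: the inverted enable drives the second registers.
    enable_inverted = enable ^ 1

    first_result: Optional[int] = None
    second_result: Optional[int] = None
    if enable == 1:
        # First registers forward the operands to the first operation circuit.
        special, other = (a, b) if _is_special(a) else (b, a)
        first_result = _first_operation(special, other)
    if enable_inverted == 1:
        # Second registers forward the operands to the conventional multiplier.
        second_result = a * b

    # MUX: select the first result when enable is 1, otherwise the second result.
    output = first_result if enable == 1 else second_result
    return enable, output
```

Only one of the two paths is exercised per operand pair in this model, which reflects the intent that the conventional multiplier stays idle whenever the shortcut applies.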

In a second aspect, a matrix multiplication method is provided. The method is applied to a matrix multiplier configured to perform matrix multiplication on a first matrix and a second matrix. The method includes: determining, by a comparison circuit, whether first target data and/or second target data are data in a first set, where the first target data is data in the first matrix, the second target data is data in the second matrix, and the first set includes 0 and ±2ⁿ, where n is an integer; and outputting, by a first operation circuit, based on the first target data and/or the second target data being data in the first set, a first result of multiplying the first target data by the second target data. The first result includes 0 or third data, where the third data is obtained by shifting the first target data or the second target data according to n, or by shifting and inverting the first target data or the second target data according to n.

With reference to the second aspect, in some implementations of the second aspect, the first operation circuit outputs 0 as the first result based on the first target data being 0.

With reference to the second aspect, in some implementations of the second aspect, the first operation circuit outputs the third data as the first result based on the first target data being 2ⁿ, where the third data is obtained by shifting the second target data left or right by |n| bits.

With reference to the second aspect, in some implementations of the second aspect, the first operation circuit outputs the third data as the first result based on the first target data being 2ⁿ and n being a positive integer, where the third data is obtained by shifting the second target data left by n bits; or the first operation circuit outputs the third data as the first result based on the first target data being 2ⁿ and n being a negative integer, where the third data is obtained by shifting the second target data right by |n| bits.

With reference to the second aspect, in some implementations of the second aspect, the first operation circuit outputs the third data as the first result based on the first target data being -2ⁿ, where the third data is obtained by shifting the second target data left or right by |n| bits and inverting the result.

With reference to the second aspect, in some implementations of the second aspect, the method further includes: determining, by the comparison circuit, a first operation code based on the first target data and/or the second target data being data in the first set, where the value of the first operation code indicates that the first target data and/or the second target data is 0 or ±2ⁿ; and determining, by the first operation circuit, based on the value of the first operation code and the first target data and/or the second target data, that the first result is 0 or the third data.

With reference to the second aspect, in some implementations of the second aspect, the method further includes: obtaining, by at least one first register, the first target data and the second target data from the comparison circuit, the at least one first register being connected to both the comparison circuit and the first operation circuit; and outputting, by the at least one first register, the first target data and the second target data to the first operation circuit.

With reference to the second aspect, in some implementations of the second aspect, the method further includes: obtaining, by the at least one first register, the first operation code from the comparison circuit, and outputting the first operation code to the first operation circuit.

With reference to the second aspect, in some implementations of the second aspect, the method further includes: outputting, by at least one second register, the obtained first target data and second target data to a second operation circuit when the comparison circuit determines that neither the first target data nor the second target data is data in the first set, the at least one second register being connected to the second operation circuit; and performing, by the second operation circuit, a conventional multiplication on the received first target data and second target data, and outputting a second result of multiplying the first target data by the second target data.

With reference to the second aspect, in some implementations of the second aspect, the method further includes: providing, by a data selector (MUX), the first result or the second result as the output of the matrix multiplier, the MUX being connected to both the second operation circuit and the first operation circuit.

With reference to the second aspect, in some implementations of the second aspect, the method further includes: outputting, by the comparison circuit, an enable signal with a value of 1 to the at least one first register when the comparison circuit determines that the first target data and/or the second target data are data in the first set; and outputting, by the at least one first register, the first target data and the second target data to the first operation circuit according to the enable signal with the value of 1.

With reference to the second aspect, in some implementations of the second aspect, the method further includes: outputting, by the comparison circuit, an enable signal with a value of 0 to the at least one first register when the comparison circuit determines that neither the first target data nor the second target data is data in the first set; inverting, by an inversion circuit, the enable signal with the value of 0 output by the comparison circuit to obtain an enable signal with a value of 1, and outputting the enable signal with the value of 1 to the at least one second register, the inversion circuit being connected to both the comparison circuit and the at least one second register; and outputting, by the at least one second register, the obtained first target data and second target data to the second operation circuit based on the enable signal with the value of 1.

With reference to the second aspect, in some implementations of the second aspect, the inversion circuit is a NOT gate.

With reference to the second aspect, in some implementations of the second aspect, the MUX provides the first result as the output of the matrix multiplier after receiving the enable signal with the value of 1; or the MUX provides the second result as the output of the matrix multiplier after receiving the enable signal with the value of 0.

In a third aspect, a computing device is provided, including at least one processor and at least one memory, and optionally an input/output interface. The at least one processor is configured to control the input/output interface to send and receive information, the at least one memory is configured to store a computer program, and the at least one processor is configured to call and run the computer program from the at least one memory, so that the computing device performs the method in the second aspect or any possible implementation of the second aspect.

Optionally, the at least one processor may be a general-purpose processor and may be implemented in hardware or in software. When implemented in hardware, the at least one processor may be a logic circuit, an integrated circuit, or the like. When implemented in software, the at least one processor may be a general-purpose processor that operates by reading software code stored in the at least one memory; the at least one memory may be integrated in the at least one processor, or may be located outside the at least one processor and exist independently.

In a fourth aspect, a chip is provided. The chip includes the matrix multiplier in the first aspect or any possible implementation of the first aspect.

In a fifth aspect, a chip is provided. The chip obtains instructions and executes the instructions to implement the method in the second aspect or any implementation of the second aspect.

Optionally, as an implementation, the chip includes a processor and a data interface. The processor reads instructions stored in a memory through the data interface and performs the method in the second aspect or any implementation of the second aspect.

Optionally, as an implementation, the chip may further include a memory in which instructions are stored. The processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor performs the method in any of the implementations of the first aspect and the second aspect.

In a sixth aspect, a computer program product containing instructions is provided. When the instructions are run by a computer, the computer performs the method in the second aspect or any implementation of the second aspect.

In a seventh aspect, a computer-readable storage medium is provided, including computer program instructions. When the computer program instructions are executed by a computer, the computer performs the method in the second aspect or any implementation of the second aspect.

As an example, such computer-readable storage media include, but are not limited to, one or more of the following: read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), flash memory, electrically erasable PROM (EEPROM), and a hard drive.

Optionally, as an implementation, the above storage medium may specifically be a non-volatile storage medium.

Brief Description of the Drawings

FIG. 1 is a schematic block diagram of a matrix multiplier 100 provided by an embodiment of the present application.

FIG. 2 is a schematic block diagram of another matrix multiplier 200 provided by an embodiment of the present application.

FIG. 3 is a schematic block diagram of a matrix multiplication method provided by an embodiment of the present application.

FIG. 4 is a schematic block diagram of a matrix multiplication apparatus 400 provided by an embodiment of the present application.

FIG. 5 is a schematic architectural diagram of a computing device 1500 provided by an embodiment of the present application.

Detailed Description

The technical solutions of the present application are described below with reference to the accompanying drawings.

The present application presents various aspects, embodiments, or features in terms of a system that includes multiple devices, components, modules, and the like. It should be understood that each system may include additional devices, components, modules, and the like, and/or may not include all of the devices, components, modules, and the like discussed in connection with the drawings. Combinations of these solutions may also be used.

In addition, in the embodiments of the present application, words such as "exemplary" and "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described in the present application as an "example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the word "example" is used to present a concept in a concrete manner.

In the embodiments of the present application, the terms "相应的" (corresponding, relevant) and "对应的" (corresponding) may sometimes be used interchangeably. It should be noted that, when the difference between them is not emphasized, they express the same meaning.

The service scenarios described in the embodiments of the present application are intended to illustrate the technical solutions of the embodiments more clearly and do not limit the technical solutions provided by the embodiments. A person of ordinary skill in the art will appreciate that, as network architectures evolve and new service scenarios emerge, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.

References in this specification to "one embodiment", "some embodiments", and the like mean that a particular feature, structure, or characteristic described in connection with that embodiment is included in one or more embodiments of the present application. Thus, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like appearing in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically emphasized otherwise. The terms "including", "comprising", "having", and their variants all mean "including but not limited to" unless specifically emphasized otherwise.

In the present application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may indicate that A exists alone, that both A and B exist, or that B exists alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following" or a similar expression refers to any combination of the listed items, including any combination of a single item or multiple items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c may be single or multiple.

Matrix multiplication (MM) is one of the most important mathematical operations in modern artificial-intelligence technologies such as neural networks and machine learning. As an example, matrix multiplication can be performed by a matrix multiplier. In a related technical solution, because matrix multiplication involves many multiplication and addition operations, the matrix multiplier includes a conventional multiplication circuit, which needs to include multiple adders, shifters, and multiple multipliers. The conventional multiplication circuit consumes relatively high power when performing matrix multiplication.

In view of this, an embodiment of the present application provides a matrix multiplier that can reduce its power consumption when performing matrix multiplication.

A matrix multiplier provided by an embodiment of the present application is first described in detail below with reference to FIG. 1.

FIG. 1 is a schematic block diagram of a matrix multiplier 100 provided by an embodiment of the present application. As shown in FIG. 1, the matrix multiplier 100 may include a comparison circuit 110 and a first operation circuit 120. The functions of the comparison circuit 110 and the first operation circuit 120 are described in detail below.

It should be understood that, for ease of description, the following uses the example in which the matrix multiplier 100 performs matrix multiplication on a first matrix and a second matrix.

The comparison circuit 110 is configured to obtain first target data from the first matrix and second target data from the second matrix, and to determine whether at least one of the first target data and the second target data is data in a first set, where the data in the first set may include, but is not limited to: 0 and ±2ⁿ, n being an integer. That is, n may be a positive integer, 0, or a negative integer, which is not specifically limited in this embodiment of the present application.

That is, in one example, the comparison circuit 110 may determine whether the first target data is data in the first set. In another example, the comparison circuit 110 may determine whether the second target data is data in the first set. In another example, the comparison circuit 110 may determine whether both the first target data and the second target data are data in the first set.

It should be noted that this embodiment of the present application does not specifically limit the order in which the comparison circuit 110 determines whether the first target data and/or the second target data are data in the first set. The first target data may be checked first, the second target data may be checked first, or the first target data and the second target data may be checked at the same time.
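By way of illustration only, the membership test performed by the comparison circuit 110 can be sketched in software as follows. The sketch assumes integer operands, so only n ≥ 0 is covered; the function names `classify` and `in_first_set` and the returned labels are hypothetical and do not come from the application.

```python
from typing import Optional, Tuple


def classify(x: int) -> Optional[Tuple[str, int]]:
    """Return ("zero", 0), ("pos", n) or ("neg", n) when x is 0, 2**n or -2**n.

    Returns None when x is not in the first set. Integer-only assumption: n >= 0.
    """
    if x == 0:
        return ("zero", 0)
    m = abs(x)
    if m & (m - 1) == 0:            # exactly one bit set, so m is a power of two
        n = m.bit_length() - 1      # position of that bit is the exponent n
        return ("pos" if x > 0 else "neg", n)
    return None


def in_first_set(a: int, b: int) -> bool:
    """True when at least one of the two operands is 0 or +/-2**n."""
    return classify(a) is not None or classify(b) is not None
```

In this sketch both operands are checked independently, which corresponds to checking them at the same time; checking one operand first would give the same answer.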

The first operation circuit 120 is configured to output, when the first target data and/or the second target data is data in the first set, a first result of multiplying the first target data by the second target data, the first result being 0 or third data. That is, when the comparison circuit 110 determines that at least one of the first target data and the second target data is data in the first set, the first operation circuit 120 can directly output 0 or the third data as the first result of multiplying the first target data by the second target data.

It should be understood that the third data can be obtained in multiple ways, which is not specifically limited in this embodiment of the present application; the way used is determined by the values of the first target data and/or the second target data. In one possible implementation, the third data output by the first operation circuit 120 is obtained by shifting the first target data left by n bits or by the absolute value of n (|n|) bits. In another possible implementation, the third data is obtained by shifting the first target data left by n or |n| bits and negating the result. In another possible implementation, the third data is obtained by shifting the first target data right by n or |n| bits. In another possible implementation, the third data is obtained by shifting the first target data right by n or |n| bits and negating the result. In another possible implementation, the third data is obtained by shifting the second target data left by n or |n| bits. In another possible implementation, the third data is obtained by shifting the second target data left by n or |n| bits and negating the result. In another possible implementation, the third data is obtained by shifting the second target data right by n or |n| bits. In another possible implementation, the third data is obtained by shifting the second target data right by n or |n| bits and negating the result.

The first result of multiplying the first target data by the second target data, as output by the first operation circuit 120, is illustrated below with several examples.

In one example, assuming that at least one of the first target data and the second target data is 0, the first operation circuit 120 may directly output 0 as the first result of multiplying the first target data by the second target data.

In another example, assuming that the first target data is 2ⁿ, the first operation circuit 120 may shift the second target data left or right to obtain the third data, and use the third data as the first result of multiplying the first target data by the second target data. For example, if n is a positive integer, the first operation circuit 120 may shift the second target data left by n bits to obtain the third data. If n is a negative integer, the first operation circuit 120 may shift the second target data right by |n| bits to obtain the third data. If n is 0, the first operation circuit 120 may shift the second target data by 0 bits to obtain the third data; that is, the output third data is the second target data itself.

In another example, assuming that the first target data is -2ⁿ, the first operation circuit 120 may shift the second target data left or right and negate the result to obtain the third data, and use the third data as the first result of multiplying the first target data by the second target data. For example, if n is a positive integer, the first operation circuit 120 may shift the second target data left by n bits and negate the result to obtain the third data. If n is a negative integer, the first operation circuit 120 may shift the second target data right by |n| bits and negate the result to obtain the third data. If n is 0, the first operation circuit 120 may shift the second target data by 0 bits and negate the result to obtain the third data; that is, the output third data is the negation of the second target data.

In another example, assuming that the second target data is 2ⁿ, the first operation circuit 120 may shift the first target data left or right to obtain the third data, and use the third data as the first result of multiplying the first target data by the second target data. For example, if n is a positive integer, the first operation circuit 120 may shift the first target data left by n bits to obtain the third data. If n is a negative integer, the first operation circuit 120 may shift the first target data right by |n| bits to obtain the third data. If n is 0, the first operation circuit 120 may shift the first target data by 0 bits to obtain the third data; that is, the output third data is the first target data itself.

In another example, assuming that the second target data is -2ⁿ, the first operation circuit 120 may shift the first target data left or right and negate the result to obtain the third data, and use the third data as the first result of multiplying the first target data by the second target data. For example, if n is a positive integer, the first operation circuit 120 may shift the first target data left by n bits and negate the result to obtain the third data. If n is a negative integer, the first operation circuit 120 may shift the first target data right by |n| bits and negate the result to obtain the third data. If n is 0, the first operation circuit 120 may shift the first target data by 0 bits and negate the result to obtain the third data; that is, the output third data is the negation of the first target data.
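The shift cases above can be summarised in a short sketch. It assumes plain integer data; under that assumption a right shift (n < 0) discards the low |n| bits, so the result equals the exact product only when those bits are zero or when the data is interpreted as fixed-point. The function name `times_pow2` and its parameters are hypothetical.

```python
def times_pow2(value: int, n: int, negative: bool = False) -> int:
    """Multiply value by 2**n (or -2**n when negative is True) using shifts only.

    n > 0  -> shift left by n bits
    n < 0  -> shift right by |n| bits (low bits are discarded for integers)
    n == 0 -> value is passed through unchanged
    """
    shifted = value << n if n >= 0 else value >> (-n)
    return -shifted if negative else shifted
```

For instance, `times_pow2(3, 1)` corresponds to multiplying 3 by 2¹, and `times_pow2(3, 2, negative=True)` corresponds to multiplying 3 by -2².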

Optionally, in some embodiments, the comparison circuit 110 is further configured to determine an operation code (opcode) when the first target data and/or the second target data is data in the first set, where the value of the opcode indicates that the first target data and/or the second target data is 0 or ±2ⁿ. In this way, the first operation circuit 120 can determine the first result of multiplying the first target data by the second target data directly from the value of the opcode corresponding to the first target data and/or the second target data.

It should be noted that the comparison circuit 110 may determine and output a single opcode according to whether the first target data and/or the second target data is 0 or ±2ⁿ, or may determine and output separate opcodes for the first target data and the second target data, which is not specifically limited in this embodiment of the present application.

By way of example, several possible implementations of the opcode are described in detail below, taking the case where the comparison circuit 110 outputs separate opcodes for the first target data and the second target data.

In one example, the opcode corresponding to the first target data has the value 0, which may be used to indicate that the first target data is 0. The first operation circuit 120 may then determine, from the opcode value of 0 for the first target data, that the first result of multiplying the first target data by the second target data is 0.

In another example, the opcode corresponding to the second target data has the value 0, which may be used to indicate that the second target data is 0. The first operation circuit 120 may then determine, from the opcode value of 0 for the second target data, that the first result of multiplying the first target data by the second target data is 0.

In another example, the opcode corresponding to the first target data has the value 2ⁿ, which may be used to indicate that the first target data is 2ⁿ. According to this opcode value and the value of n, the first operation circuit 120 may shift the second target data left or right by |n| bits to obtain the third data, and use the third data as the first result of multiplying the first target data by the second target data. For the process by which the first operation circuit 120 determines the third data from the second target data and the value of n, refer to the description above; details are not repeated here.

In another example, the opcode corresponding to the first target data has the value -2ⁿ, which may be used to indicate that the first target data is -2ⁿ. According to this opcode value of -2ⁿ and the value of n, the first operation circuit 120 may shift the second target data left or right by |n| bits and negate the result to obtain the third data, and use the third data as the first result of multiplying the first target data by the second target data. For the process by which the first operation circuit 120 determines the third data from the second target data and the value of n, refer to the description above; details are not repeated here.

In another example, the opcode corresponding to the second target data has the value 2ⁿ, which may be used to indicate that the second target data is 2ⁿ. According to this opcode value and the value of n, the first operation circuit 120 may shift the first target data left or right by |n| bits to obtain the third data, and use the third data as the first result of multiplying the first target data by the second target data. For the process by which the first operation circuit 120 determines the third data from the first target data and the value of n, refer to the description above; details are not repeated here.

In another example, the opcode corresponding to the second target data has the value -2ⁿ, which may be used to indicate that the second target data is -2ⁿ. According to this opcode value of -2ⁿ and the value of n, the first operation circuit 120 may shift the first target data left or right by |n| bits and negate the result to obtain the third data, and use the third data as the first result of multiplying the first target data by the second target data. For the process by which the first operation circuit 120 determines the third data from the first target data and the value of n, refer to the description above; details are not repeated here.
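One way to picture the per-operand opcodes is the sketch below. The application does not specify a bit-level encoding, so the `Opcode` record with a `kind` field ("ZERO", "POS", "NEG") and an exponent `n` is purely an assumed encoding, as are the function names `apply_opcode` and `first_result`. A value of `None` stands for "this operand is not in the first set".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Opcode:
    """Assumed opcode encoding: kind is "ZERO", "POS" (+2**n) or "NEG" (-2**n)."""
    kind: str
    n: int = 0


def apply_opcode(op: Opcode, other: int) -> int:
    """Produce the product from the opcode of one operand and the other operand."""
    if op.kind == "ZERO":
        return 0
    # Left shift for n >= 0, right shift by |n| for n < 0
    shifted = other << op.n if op.n >= 0 else other >> -op.n
    return -shifted if op.kind == "NEG" else shifted


def first_result(a: int, op_a: Optional[Opcode],
                 b: int, op_b: Optional[Opcode]) -> Optional[int]:
    """Use whichever operand has an opcode; return None if neither does."""
    if op_a is not None:
        return apply_opcode(op_a, b)
    if op_b is not None:
        return apply_opcode(op_b, a)
    return None
```

When both operands carry an opcode, this sketch arbitrarily prefers the opcode of the first operand; either choice yields the same product.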

Optionally, in some embodiments, the matrix multiplier 100 may further include at least one first register, which is connected to the comparison circuit 110 and to the first operation circuit 120. The at least one first register is configured to obtain the first target data and the second target data from the comparison circuit 110 and to output them to the first operation circuit 120. In one possible implementation, after the input of the at least one first register receives the first target data and the second target data from the comparison circuit 110, the register may store them and, after its clock pulse input receives a clock control signal (clk) from a clock control unit, transmit the first target data and the second target data from its output to the first operation circuit 120.

Optionally, in some embodiments, the at least one first register may also receive the opcode from the comparison circuit 110 and transmit it to the first operation circuit 120. In one possible implementation, after its clock pulse input receives the clock control signal (clk) from the clock control unit, the at least one first register transmits the opcode from its output to the first operation circuit 120.

It should be noted that the first target data and the second target data may be stored in the same first register or in different first registers, which is not specifically limited in this embodiment of the present application.

It should also be noted that the opcode may be stored in the same first register as the first target data and/or the second target data, or in a different first register, which is likewise not specifically limited in this embodiment of the present application.

Optionally, in some embodiments, the matrix multiplier 100 may further include at least one second register and a second operation circuit, the at least one second register being connected to the second operation circuit. The at least one second register is configured to output the obtained first target data and second target data to the second operation circuit when the comparison circuit 110 determines that neither the first target data nor the second target data is data in the first set. The second operation circuit is configured to perform a conventional multiplication on the received first target data and second target data and to output a second result of multiplying the first target data by the second target data.

It should be understood that, in the above embodiment, when the first target data and/or the second target data is data in the first set, the first operation circuit 120 can be used to output the first result of multiplying the first target data by the second target data, and the first result is taken as the final output of the matrix multiplier 100, which saves power in the matrix multiplier 100. When neither the first target data nor the second target data is data in the first set, the second operation circuit is used to multiply the first target data by the second target data to obtain the second result, and the second result is taken as the final output of the matrix multiplier 100.

In one possible implementation, a multiplexer (MUX) may be used to select whether the first result or the second result is taken as the final output of the matrix multiplier 100.

In this embodiment of the present application, an enable signal may be used to decide whether the second operation circuit or the first operation circuit 120 is used to determine the result of multiplying the first target data by the second target data.

By way of example, the inputs of the at least one first register are the first target data, the second target data, the corresponding opcode, and a first enable signal, and the inputs of the at least one second register are the first target data, the second target data, and a second enable signal, where the value of the second enable signal is the opposite of the value of the first enable signal. In one example, assuming that the first target data and/or the second target data is data in the first set, the comparison circuit 110 outputs a first enable signal with the value 1. On receiving the first enable signal with the value 1, the at least one first register outputs, from its output, the first target data, the second target data, and the corresponding opcode obtained at its input. On receiving the second enable signal with the value 0, the at least one second register does not output the first target data and the second target data obtained at its input; that is, after receiving an enable signal with the value 0, the output of the at least one second register remains unchanged, still holding the output data of the previous clock cycle, and does not change when the data at its input changes. In another example, assuming that neither the first target data nor the second target data is data in the first set, the comparison circuit 110 outputs a first enable signal with the value 0. On receiving the first enable signal with the value 0, the at least one first register does not output the first target data, the second target data, and the corresponding opcode obtained at its input; that is, after receiving an enable signal with the value 0, the output of the at least one first register remains unchanged, still holding the output data of the previous clock cycle, and does not change when the data at its input changes. On receiving the second enable signal with the value 1, the at least one second register outputs, from its output, the first target data and the second target data obtained at its input.
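The hold-or-update behaviour of the enabled registers can be modelled as in the following sketch. It is a behavioural software model under stated assumptions, not a hardware description: the class `EnabledRegister`, the helper `clock_both`, and the use of `None` for a register that has never been loaded are all hypothetical.

```python
class EnabledRegister:
    """Behavioural model of a register with an enable input (hypothetical name)."""

    def __init__(self):
        self.q = None  # output; None until the first enabled clock edge

    def rising_edge(self, d, en):
        """On a rising clock edge, latch d only when en == 1; otherwise hold q."""
        if en == 1:
            self.q = d
        return self.q


def clock_both(first_reg, second_reg, a, b, opcode, first_en):
    """Clock a first and a second register with complementary enable signals."""
    second_en = 1 - first_en  # second enable is the opposite of the first enable
    first_reg.rising_edge((a, b, opcode), first_en)
    second_reg.rising_edge((a, b), second_en)
    return first_reg.q, second_reg.q
```

Because the disabled register keeps its previous output, the circuit fed by that register sees unchanged inputs in that cycle, which is the behaviour the paragraph above relies on.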

Optionally, in some embodiments, an inversion circuit may be used to make the value of the second enable signal the opposite of the value of the first enable signal. In one example, the inversion circuit is a NOT gate.

Optionally, in some embodiments, the MUX may also use the first enable signal to select whether the first result or the second result is taken as the final output of the matrix multiplier 100. Specifically, in one possible implementation, the first enable signal output by the comparison circuit 110 has the value 1, and the MUX, according to that value, takes the first result as the final output of the matrix multiplier 100. In another possible implementation, the first enable signal output by the comparison circuit 110 has the value 0, and the MUX, according to that value, takes the second result as the final output of the matrix multiplier 100.

When multiplying the data of two matrices, the above matrix multiplier can directly output the corresponding special result for special data (for example, data equal to 0 or ±2ⁿ) instead of performing a conventional multiplication on that data. This reduces the higher power consumption caused by conventional multiplication. Moreover, because a matrix multiplier multiplies the data of two matrices and therefore performs a large number of operations, the benefit of the power saving is all the more significant.
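To give a rough sense of how often the special path could apply, the sketch below counts, for one matrix product, how many scalar multiplications have at least one operand equal to 0 or ±2ⁿ. It is only a counting illustration under the assumption of integer matrices (n ≥ 0); it does not model power, and the names `is_special` and `matmul_with_stats` are hypothetical.

```python
def is_special(x: int) -> bool:
    """True when x is 0 or +/-2**n for some integer n >= 0."""
    m = abs(x)
    return m & (m - 1) == 0  # holds for 0 and for any single-bit magnitude


def matmul_with_stats(A, B):
    """Multiply A (rows x inner) by B (inner x cols) and count which path applies.

    Returns (C, special_count, conventional_count).
    """
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    special = conventional = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                a, b = A[i][k], B[k][j]
                if is_special(a) or is_special(b):
                    special += 1       # would go through the first operation circuit
                else:
                    conventional += 1  # would need a conventional multiplication
                C[i][j] += a * b       # reference product, used only for the result
    return C, special, conventional
```

The fraction of products that take the special path depends entirely on the data; sparse or quantised matrices would be expected to hit it more often than arbitrary ones.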

Another matrix multiplier 200 provided by an embodiment of the present application is described in detail below with reference to FIG. 2. It should be understood that the example of FIG. 2 is intended only to help those skilled in the art understand the embodiments of the present application, and not to limit the embodiments to the specific values or scenarios illustrated in FIG. 2. Those skilled in the art can clearly make various equivalent modifications or changes based on the example given for FIG. 2 below, and such modifications and changes also fall within the scope of the embodiments of the present application.

FIG. 2 is a schematic block diagram of another matrix multiplier 200 provided by an embodiment of the present application. As shown in FIG. 2, the matrix multiplier 200 may include a comparator 210, register 1, register 2, register 3, register 4, a first operation circuit 220, a second operation circuit 230, a MUX 240, and a NOT gate 250.

It should be noted that this embodiment of the present application does not specifically limit the number of first registers and second registers. For ease of description, FIG. 2 takes two first registers (register 3 and register 4) and two second registers (register 1 and register 2) as an example.

Referring to FIG. 2, operand A (corresponding to the first target data above) and operand B (corresponding to the second target data above) are input to the comparator 210. The comparator 210 determines, by the method described above, whether at least one of operand A and operand B is data in the first set, and outputs a corresponding enable signal, where the data in the first set may include, but is not limited to: 0 and ±2ⁿ, n being an integer.

As an example, each part of the matrix multiplier 200 is described in detail below for the case where operand A has the value 2 and operand B has the value 3.

The comparator 210 determines that the value of operand A is data in the first set (2ⁿ with n equal to 1), and outputs operand A, the opcode corresponding to operand A (the opcode has the value 1, indicating that operand A has the value 2¹), operand B, and an enable signal with the value 1.

The NOT gate 250 inverts the enable signal with the value 1 output by the comparator 210 to obtain an enable signal with the value 0, and outputs the enable signal with the value 0 to register 1 and register 2.

Register 1 receives operand A at its input (D1), the enable signal with the value 0 output by the NOT gate 250 at its enable terminal (EN1), and the clock signal (clk) sent by the clock control unit at its clock pulse input (clk 1). When the rising edge of the clock pulse at clk 1 arrives (that is, a transition from low level to high level), the output (Q1) of register 1 does not output the operand A received at the input (D1), because the enable signal has the value 0.

Register 2 receives operand B at its input (D2), the enable signal with the value 0 output by the NOT gate 250 at its enable terminal (EN2), and the clock signal (clk) sent by the clock control unit at its clock pulse input (clk 2). When the rising edge of the clock pulse at clk 2 arrives (that is, a transition from low level to high level), the output (Q2) of register 2 does not output the operand B received at the input (D2), because the enable signal has the value 0.

Register 3 receives, at its input (D3), operand A and the opcode corresponding to operand A (the opcode has the value 1, indicating that operand A has the value 2¹); it receives an enable signal with the value 1 at its enable terminal (EN3) and the clock signal (clk) sent by the clock control unit at its clock pulse input (clk 3). When the rising edge of the clock pulse at clk 3 arrives (that is, a transition from low level to high level), register 3 outputs operand A and its opcode from the input (D3) to the output (Q3), because the enable signal has the value 1.

Register 4 receives operand B at its input (D4), an enable signal with the value 1 at its enable terminal (EN4), and the clock signal (clk) sent by the clock control unit at its clock pulse input (clk 4). When the rising edge of the clock pulse at clk 4 arrives (that is, a transition from low level to high level), register 4 outputs operand B from the input (D4) to the output (Q4), because the enable signal has the value 1.

The first operation circuit 220 is configured to obtain the data output by register 3 and register 4. Specifically, the data obtained at the input terminal of the first operation circuit 220 includes operand A, the opcode corresponding to operand A (the opcode has a value of 1, indicating that operand A has a value of 2¹), and operand B. From the opcode value of 1, the first operation circuit 220 determines that operand A has a value of 2¹. The first operation circuit 220 can therefore shift operand B left by 1 bit and take the shifted value as the output result (the first result) at its output terminal.

The second operation circuit 230 is configured to obtain the data output by register 1 and register 2. Because the output terminals of register 1 and register 2 do not output operand A and operand B, and still hold the data output in the previous clock cycle, the second operation circuit 230 does not perform the multiplication of operand A and operand B. The multiplication output result (the second result) at the output terminal of the second operation circuit 230 remains the result calculated in the previous clock cycle.

The MUX 240 receives the first result and the second result at its input terminals, and an enable signal with a value of 1 at its enable terminal (EN5). According to the enable signal with a value of 1, the MUX 240 takes the first result output by the first operation circuit 220 as the final output result of the matrix multiplier 200.

As another example, each part of the matrix multiplier 200 is described in detail below, taking a value of 5 for operand A and a value of 3 for operand B.

The comparator 210 determines that neither the value of operand A nor the value of operand B is data in the first set (the data in the first set includes 0 and 2ⁿ, where n is a positive number), and therefore outputs operand A, operand B, and an enable signal with a value of 0.

The NOT gate 250 inverts the enable signal with a value of 0 output by the comparator 210 to obtain an enable signal with a value of 1, and outputs the enable signal with a value of 1 to register 1 and register 2.

Register 1 receives operand A at its input terminal (D1), the enable signal with a value of 1 output by the NOT gate 250 at its enable terminal (EN1), and the clock signal (clk) issued by the clock control unit at its clock pulse input terminal (clk 1). When the rising edge of the clock pulse signal (that is, the transition from low level to high level) arrives at the clk 1 terminal, register 1 outputs the operand A at its input terminal (D1) from its output terminal (Q1), because the value of the enable signal is 1.

Register 2 receives operand B at its input terminal (D2), the enable signal with a value of 1 output by the NOT gate 250 at its enable terminal (EN2), and the clock signal (clk) issued by the clock control unit at its clock pulse input terminal (clk 2). When the rising edge of the clock pulse signal (that is, the transition from low level to high level) arrives at the clk 2 terminal, register 2 outputs the operand B at its input terminal (D2) from its output terminal (Q2), because the value of the enable signal is 1.

Register 3 receives operand A at its input terminal (D3), an enable signal with a value of 0 at its enable terminal (EN3), and the clock signal (clk) issued by the clock control unit at its clock pulse input terminal (clk 3). When the rising edge of the clock pulse signal (that is, the transition from low level to high level) arrives at the clk 3 terminal, the output terminal (Q3) of register 3 does not output the operand A received at the input terminal (D3), because the value of the enable signal is 0.

Register 4 receives operand B at its input terminal (D4), an enable signal with a value of 0 at its enable terminal (EN4), and the clock signal (clk) issued by the clock control unit at its clock pulse input terminal (clk 4). When the rising edge of the clock pulse signal (that is, the transition from low level to high level) arrives at the clk 4 terminal, the output terminal (Q4) of register 4 does not output the operand B received at the input terminal (D4), because the value of the enable signal is 0.

The second operation circuit 230 is configured to obtain operand A and operand B from the outputs of register 1 and register 2, respectively, and to perform a conventional multiplication on operand A and operand B. The product is taken as the multiplication output result (the second result) at the output terminal of the second operation circuit 230.

The first operation circuit 220 is configured to obtain the data output by register 3 and register 4. Because the output terminals of register 3 and register 4 do not output operand A and operand B, the output result (the first result) at the output terminal of the first operation circuit 220 remains the result of the previous clock cycle.

The MUX 240 receives the first result and the second result at its input terminals, and an enable signal with a value of 0 at its enable terminal (EN5). According to the enable signal with a value of 0, the MUX 240 takes the second result output by the second operation circuit 230 as the final output result of the matrix multiplier 200.
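The two examples above can be traced with a small behavioral model. The following is only an illustrative sketch, not the circuit of the application: it assumes unsigned integer operands, checks only operand A against the first set, and uses a hypothetical opcode encoding in which the opcode equals n when operand A is 2ⁿ. The names `EnableRegister`, `MatrixMultiplierModel`, `compare` and `clock`, as well as the value 3 chosen for operand B in the first cycle, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class EnableRegister:
    """Register with an enable input: on a rising clock edge Q takes the
    value of D only when EN is 1; otherwise Q holds its previous value."""
    q: object = None

    def rising_edge(self, d, en):
        if en == 1:
            self.q = d
        return self.q


class MatrixMultiplierModel:
    """Behavioral sketch of comparator 210, NOT gate 250, registers 1-4,
    operation circuits 220/230 and MUX 240 (assumed integer operands)."""

    def __init__(self):
        self.reg1, self.reg2 = EnableRegister(), EnableRegister()  # feed circuit 230
        self.reg3, self.reg4 = EnableRegister(), EnableRegister()  # feed circuit 220
        self.first_result = None   # held output of the shift path
        self.second_result = None  # held output of the multiplier path
        self.multiplies = 0        # number of cycles that used the multiplier

    @staticmethod
    def compare(a):
        """Return (enable, opcode). Hypothetical encoding: the opcode is n
        when a == 2**n, and None when a == 0 or a is not in the first set."""
        if a == 0:
            return 1, None
        if a > 0 and (a & (a - 1)) == 0:  # exactly one bit set
            return 1, a.bit_length() - 1
        return 0, None

    def clock(self, a, b):
        """Simulate one rising clock edge and return the MUX output."""
        en, opcode = self.compare(a)
        not_en = 1 - en  # NOT gate 250
        self.reg1.rising_edge(a, not_en)
        self.reg2.rising_edge(b, not_en)
        self.reg3.rising_edge((a, opcode), en)
        self.reg4.rising_edge(b, en)
        if en == 1:
            a3, op = self.reg3.q
            self.first_result = 0 if a3 == 0 else self.reg4.q << op
        else:
            self.second_result = self.reg1.q * self.reg2.q
            self.multiplies += 1
        # MUX 240: enable 1 selects the first result, enable 0 the second.
        return self.first_result if en == 1 else self.second_result
```

Calling `clock(2, 3)` and then `clock(5, 3)` on one model instance reproduces the two cycles described above: the first call takes the shift path while registers 1 and 2 keep their earlier contents, and the second call takes the multiplier path while registers 3 and 4 keep the values latched in the first cycle.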

A method of matrix multiplication provided by an embodiment of the present application is described in detail below with reference to FIG. 3.

FIG. 3 is a schematic block diagram of a method of matrix multiplication provided by an embodiment of the present application. The method is applied to a matrix multiplier that is configured to perform a matrix multiplication operation on a first matrix and a second matrix. As shown in FIG. 3, the method may include steps 310 and 320, which are described in detail below.

Step 310: The comparison circuit determines whether first target data and/or second target data is data in a first set, where the first target data is data in the first matrix, the second target data is data in the second matrix, and the first set includes 0 and ±2ⁿ, n being an integer.
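The membership test of step 310 can be sketched in software as follows. This is a minimal illustration under the assumption that the target data are integers, so that only n ≥ 0 can occur; the function name `classify` and the returned labels are hypothetical and do not come from the application. It relies on the fact that a positive integer is a power of two exactly when it has a single bit set.

```python
def classify(x: int):
    """Classify an integer against the first set {0, +2**n, -2**n}.

    Returns ("zero", None), ("pow2", n), ("neg_pow2", n) or ("other", None),
    where n >= 0 satisfies abs(x) == 2**n.
    """
    if x == 0:
        return ("zero", None)
    mag = abs(x)
    if (mag & (mag - 1)) == 0:      # exactly one bit set: a power of two
        n = mag.bit_length() - 1    # position of that bit
        return ("pow2" if x > 0 else "neg_pow2", n)
    return ("other", None)
```

For a fixed-point or floating-point format, where n may also be negative, the test would have to inspect the representation differently; that case is outside this sketch.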

Step 320: On the basis that the first target data and/or the second target data is data in the first set, the first operation circuit outputs a first result of multiplying the first target data by the second target data, the first result including 0 or third data.

The third data is obtained by shifting the first target data or the second target data according to n, or by shifting the first target data or the second target data according to n and negating the shifted value.

Optionally, on the basis that the first target data is 0, the first operation circuit outputs 0 as the first result.

Optionally, on the basis that the first target data is 2ⁿ, the first operation circuit outputs the third data as the first result, the third data being obtained by shifting the second target data left or right by |n| bits.

Optionally, on the basis that the first target data is 2ⁿ and n is a positive integer, the first operation circuit outputs the third data as the first result, the third data being obtained by shifting the second target data left by n bits; or, on the basis that the first target data is 2ⁿ and n is a negative integer, the first operation circuit outputs the third data as the first result, the third data being obtained by shifting the second target data right by |n| bits.

Optionally, on the basis that the first target data is -2ⁿ, the first operation circuit outputs the third data as the first result, the third data being obtained by shifting the second target data left or right by |n| bits and negating the shifted value.
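The shift rules above can be written compactly. The sketch below is an assumption-laden illustration rather than the circuit itself: it treats the second target data as a Python integer, reads the negation as arithmetic negation of the shifted value, and uses the hypothetical function name `third_data`. A right shift is exact only when the |n| low-order bits of the operand are zero; otherwise the result is rounded toward negative infinity.

```python
def third_data(b: int, n: int, negate: bool = False) -> int:
    """Compute b * 2**n (or b * -(2**n) when negate is True) with shifts.

    n >= 0 : shift b left by n bits.
    n <  0 : shift b right by |n| bits (arithmetic shift on Python ints).
    """
    shifted = b << n if n >= 0 else b >> -n
    return -shifted if negate else shifted
```

For instance, with b = 12 the call `third_data(12, -2)` corresponds to multiplying by 2⁻², and `third_data(3, 1, negate=True)` corresponds to multiplying 3 by -2¹.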

Optionally, the method further includes: the comparison circuit determines a first operation code on the basis that the first target data and/or the second target data is data in the first set, where the value of the first operation code indicates that the first target data and/or the second target data is 0 or ±2ⁿ; and the first operation circuit determines, according to the value of the first operation code and the first target data and/or the second target data, that the first result is 0 or the third data.

Optionally, the method further includes: at least one first register obtains the first target data and the second target data from the comparison circuit, the at least one first register being connected to the comparison circuit and to the first operation circuit; and the at least one first register outputs the first target data and the second target data to the first operation circuit.

Optionally, the method further includes: the at least one first register obtains the first operation code from the comparison circuit and outputs the first operation code to the first operation circuit.

Optionally, the method further includes: when the comparison circuit determines that neither the first target data nor the second target data is data in the first set, at least one second register outputs the obtained first target data and second target data to a second operation circuit, the at least one second register being connected to the second operation circuit; and the second operation circuit performs a conventional multiplication on the received first target data and second target data and outputs a second result of multiplying the first target data by the second target data.

Optionally, the method further includes: a data selector (MUX) takes the first result or the second result as the output of the matrix multiplier, the MUX being connected to the second operation circuit and to the first operation circuit.

Optionally, the method further includes: when the comparison circuit determines that the first target data and/or the second target data is data in the first set, the comparison circuit outputs an enable signal with a value of 1 to the at least one first register; and the at least one first register outputs the first target data and the second target data to the first operation circuit according to the enable signal with a value of 1.

Optionally, the method further includes: when the comparison circuit determines that neither the first target data nor the second target data is data in the first set, the comparison circuit outputs an enable signal with a value of 0 to the at least one first register; an inverting circuit inverts the enable signal with a value of 0 output by the comparison circuit to obtain an enable signal with a value of 1 and outputs the enable signal with a value of 1 to the at least one second register, the inverting circuit being connected to the comparison circuit and to the at least one second register; and the at least one second register outputs the obtained first target data and second target data to the second operation circuit based on the enable signal with a value of 1.

Optionally, the inverting circuit is a NOT gate.

Optionally, after receiving the enable signal with a value of 1, the MUX takes the first result as the output of the matrix multiplier; or, after receiving the enable signal with a value of 0, the MUX takes the second result as the output of the matrix multiplier.

It should be understood that the description of the method embodiment shown in FIG. 3 corresponds to the description of the matrix multiplier shown in FIG. 1 or FIG. 2. For parts not described in detail, reference may therefore be made to the foregoing embodiments of the matrix multiplier.
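To show how the method applies to whole matrices, the following sketch multiplies two integer matrices element by element, taking the shift path whenever either factor is 0 or ±2ⁿ and falling back to a conventional multiplication otherwise. It is a software illustration under stated assumptions (integer data, n ≥ 0, negation read as arithmetic negation); the function names `multiply_element` and `matmul` and the `stats` counters are hypothetical and serve only to make visible how often the multiplier is bypassed.

```python
def multiply_element(a: int, b: int, stats: dict) -> int:
    """Multiply a by b, using a shift when either factor is 0 or +/-2**n."""
    for special, other in ((a, b), (b, a)):
        if special == 0:
            stats["shift_path"] += 1
            return 0
        mag = abs(special)
        if (mag & (mag - 1)) == 0:  # special is +/-2**n
            stats["shift_path"] += 1
            shifted = other << (mag.bit_length() - 1)
            return -shifted if special < 0 else shifted
    # Neither factor is in the first set: conventional multiplication.
    stats["multiplier_path"] += 1
    return a * b


def matmul(first, second):
    """Return (product, stats) for matrices given as lists of rows."""
    rows, inner, cols = len(first), len(second), len(second[0])
    stats = {"shift_path": 0, "multiplier_path": 0}
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            out[i][j] = sum(
                multiply_element(first[i][k], second[k][j], stats)
                for k in range(inner)
            )
    return out, stats
```

With a first matrix such as `[[1, 0], [5, -4]]`, most element products involve 0 or a power of two, so only the products with the element 5 reach the conventional multiplier; this is the situation in which the application expects power savings.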

The method provided by the embodiments of the present application has been described in detail above with reference to FIG. 3. Embodiments of the apparatus of the present application are described in detail below with reference to FIG. 4 and FIG. 5. It should be understood that the descriptions of the method embodiments and of the apparatus embodiments correspond to each other; for parts not described in detail, reference may therefore be made to the foregoing method embodiments.

FIG. 4 is a schematic block diagram of an apparatus 400 for matrix multiplication provided by an embodiment of the present application. The apparatus 400 can be implemented by software, hardware, or a combination of both. The apparatus 400 can implement the method flow shown in FIG. 3 and includes a comparison module 410, a determination module 420, and an output module 430. The comparison module 410 is configured for the comparison circuit to determine whether first target data and/or second target data is data in a first set, where the first target data is data in the first matrix, the second target data is data in the second matrix, and the first set includes 0 and ±2ⁿ, n being an integer. The determination module 420 is configured for the first operation circuit to determine, on the basis that the first target data and/or the second target data is data in the first set, a first result of multiplying the first target data by the second target data, the first result including 0 or third data, where the third data is obtained by shifting the first target data or the second target data according to n, or by shifting the first target data or the second target data according to n and negating the shifted value. The output module 430 is configured to output the first result.

Optionally, the determination module 420 is specifically configured for the first operation circuit to determine, on the basis that the first target data is 0, that the first result is 0.

Optionally, the determination module 420 is specifically configured for the first operation circuit to determine, on the basis that the first target data is 2ⁿ, that the first result is the third data, the third data being obtained by shifting the second target data left or right by |n| bits.

Optionally, the determination module 420 is specifically configured for the first operation circuit to determine, on the basis that the first target data is 2ⁿ and n is a positive integer, that the first result is the third data, the third data being obtained by shifting the second target data left by n bits; or for the first operation circuit to determine, on the basis that the first target data is 2ⁿ and n is a negative integer, that the first result is the third data, the third data being obtained by shifting the second target data right by |n| bits.

Optionally, the determination module 420 is specifically configured for the first operation circuit to determine, on the basis that the first target data is -2ⁿ, that the first result is the third data, the third data being obtained by shifting the second target data left or right by |n| bits and negating the shifted value.

Optionally, the determination module 420 is further configured for the comparison circuit to determine a first operation code on the basis that the first target data and/or the second target data is data in the first set, where the value of the first operation code indicates that the first target data and/or the second target data is 0 or ±2ⁿ; and for the first operation circuit to determine, according to the value of the first operation code and the first target data and/or the second target data, that the first result is 0 or the third data.

Optionally, the apparatus 400 further includes an acquisition module configured for at least one first register to obtain the first target data and the second target data from the comparison circuit, the at least one first register being connected to the comparison circuit and to the first operation circuit. The output module 430 is further configured for the at least one first register to output the first target data and the second target data to the first operation circuit.

Optionally, the acquisition module is further configured for the at least one first register to obtain the first operation code from the comparison circuit, and the output module 430 is further configured for the at least one first register to output the first operation code to the first operation circuit.

Optionally, the apparatus 400 further includes a multiplication module. The output module 430 is further configured for at least one second register to output the obtained first target data and second target data to a second operation circuit when the comparison circuit determines that neither the first target data nor the second target data is data in the first set, the at least one second register being connected to the second operation circuit. The multiplication module is configured for the second operation circuit to perform a conventional multiplication on the received first target data and second target data to obtain a second result of multiplying the first target data by the second target data. The output module 430 is further configured for the second operation circuit to output the second result.

Optionally, the output module 430 is further configured for a data selector (MUX) to take the first result or the second result as the output of the matrix multiplier, the MUX being connected to the second operation circuit and to the first operation circuit.

Optionally, the output module 430 is further configured for the comparison circuit to output an enable signal with a value of 1 to the at least one first register when the comparison circuit determines that the first target data and/or the second target data is data in the first set, and for the at least one first register to output the first target data and the second target data to the first operation circuit according to the enable signal with a value of 1.

Optionally, the apparatus 400 further includes an inversion module. The output module 430 is further configured for the comparison circuit to output an enable signal with a value of 0 to the at least one first register when the comparison circuit determines that neither the first target data nor the second target data is data in the first set. The inversion module is configured for an inverting circuit to invert the enable signal with a value of 0 output by the comparison circuit to obtain an enable signal with a value of 1, the inverting circuit being connected to the comparison circuit and to the at least one second register. The output module 430 is further configured for the inversion module to output the enable signal with a value of 1 to the at least one second register, and for the at least one second register to output the obtained first target data and second target data to the second operation circuit based on the enable signal with a value of 1.

Optionally, the output module 430 is further configured for the MUX to take the first result as the output of the matrix multiplier after receiving the enable signal with a value of 1, or for the MUX to take the second result as the output of the matrix multiplier after receiving the enable signal with a value of 0.

The apparatus 400 here may be embodied in the form of functional modules. The term "module" here may be implemented in software and/or hardware, which is not specifically limited.

The modules of the examples described in the embodiments of the present application can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are executed in hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be regarded as going beyond the scope of the present application.

It should be noted that, when the apparatus provided in the above embodiment executes the above method, the division into the functional modules described above is only an example. In practical applications, the functions may be allocated to different functional modules as required; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. For example, each of the comparison module 410, the determination module 420, and the output module 430 may be used to perform any step of the above method. The steps that the comparison module 410, the determination module 420, and the output module 430 are responsible for can be specified as required, and the three modules together implement all functions of the apparatus by each implementing different steps of the method.

In addition, the apparatus provided in the above embodiment and the method embodiments belong to the same concept. For the specific implementation process, see the method embodiments above; details are not repeated here.

The method provided in the embodiments of the present application may be executed by a computing device, which may also be called a computer system. The computing device includes a hardware layer, an operating system layer running on the hardware layer, and an application layer running on the operating system layer. The hardware layer includes hardware such as a processing unit, a memory, and a memory control unit, whose functions and structures are described in detail later. The operating system is any one or more computer operating systems that implement business processing through processes, for example, the Linux, Unix, Android, iOS, or Windows operating system. The application layer includes applications such as a browser, an address book, word processing software, and instant messaging software. Optionally, the computer system is a handheld device such as a smartphone, or a terminal device such as a personal computer; the present application does not specifically limit this, provided that the device can execute the method provided in the embodiments of the present application. The method provided in the embodiments of the present application may be executed by a computing device, or by a functional module in the computing device that is capable of invoking and executing a program.

A computing device provided by an embodiment of the present application is described in detail below with reference to FIG. 5.

FIG. 5 is a schematic architectural diagram of a computing device 1500 provided by an embodiment of the present application. The computing device 1500 may be a server, a computer, or another device with computing capability. The computing device 1500 shown in FIG. 5 includes at least one processor 1510 and a memory 1520.

It should be understood that the present application does not limit the number of processors or memories in the computing device 1500.

The processor 1510 executes the instructions in the memory 1520, so that the computing device 1500 implements the method provided in the present application. Alternatively, the processor 1510 executes the instructions in the memory 1520, so that the computing device 1500 implements the functional modules provided in the present application and thereby implements the method provided in the present application.

Optionally, the computing device 1500 further includes a communication interface 1530. The communication interface 1530 uses a transceiver module, such as but not limited to a network interface card or a transceiver, to implement communication between the computing device 1500 and other devices or communication networks.

Optionally, the computing device 1500 further includes a system bus 1540, to which the processor 1510, the memory 1520, and the communication interface 1530 are each connected. The processor 1510 can access the memory 1520 through the system bus 1540; for example, the processor 1510 can read and write data or execute code in the memory 1520 through the system bus 1540. The system bus 1540 is a peripheral component interconnect express (PCIe) bus, an extended industry standard architecture (EISA) bus, or the like. The system bus 1540 is divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 5, but this does not mean that there is only one bus or only one type of bus.

In a possible implementation, the function of the processor 1510 is mainly to interpret the instructions (or code) of computer programs and to process data in computer software. The instructions of the computer program and the data in the computer software can be stored in the memory 1520 or in the cache 1516.

Optionally, the processor 1510 may be an integrated circuit chip with signal processing capability. By way of example and not limitation, the processor 1510 is a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor is, for example, a microprocessor. For example, the processor 1510 is a central processing unit (CPU).

Optionally, each processor 1510 includes at least one processing unit 1512 and a memory control unit 1514.

Optionally, the processing unit 1512, also called a core, is the most important component of the processor. The processing unit 1512 is manufactured from monocrystalline silicon by a given production process, and all of the processor's computation, command reception, command storage, and data processing are performed by the core. The processing units each run program instructions independently and use parallel computing to speed up program execution. Each kind of processing unit has a fixed logical structure; for example, a processing unit includes logical units such as a level-1 cache, a level-2 cache, an execution unit, an instruction-level unit, and a bus interface.

In an implementation example, the memory control unit 1514 is configured to control data interaction between the memory 1520 and the processing unit 1512. Specifically, the memory control unit 1514 receives a memory access request from the processing unit 1512 and controls access to the memory based on that request. By way of example and not limitation, the memory control unit is a device such as a memory management unit (MMU).

In an implementation example, each memory control unit 1514 addresses the memory 1520 through the system bus. An arbiter (not shown in FIG. 5) is configured in the system bus and is responsible for handling and coordinating competing accesses from multiple processing units 1512.

In an implementation example, the processing unit 1512 and the memory control unit 1514 are communicatively connected through on-chip connection lines, such as address lines, thereby enabling communication between the processing unit 1512 and the memory control unit 1514.

Optionally, each processor 1510 further includes a cache 1516, where the cache is a buffer for data exchange. When the processing unit 1512 needs to read data, it first looks for the data in the cache; if the data is found, it is used directly, and if not, the processing unit looks for it in the memory. Because the cache runs much faster than the memory, the role of the cache is to help the processing unit 1512 run faster.
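The cache-first read path described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `CachedReader`, the dictionary-based cache, and the policy of filling the cache on a miss are hypothetical and are not taken from the application, which only states that the cache is consulted before the memory.

```python
class CachedReader:
    """Toy model of a processing unit that checks a cache before memory."""

    def __init__(self, memory):
        self.memory = memory  # backing store, stands in for memory 1520
        self.cache = {}       # stands in for cache 1516 (assumed unbounded)
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # Look in the cache first; on a hit the data is used directly.
        if address in self.cache:
            self.hits += 1
            return self.cache[address]
        # On a miss, fall back to memory. Filling the cache afterwards is
        # an assumption of this sketch, not something the text specifies.
        self.misses += 1
        value = self.memory[address]
        self.cache[address] = value
        return value


reader = CachedReader({0: 10, 1: 20})
first = reader.read(0)   # miss: served from memory
second = reader.read(0)  # hit: served from the cache
```

A real cache is bounded and needs a replacement policy; the model omits both to keep the hit and miss paths visible.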

The memory 1520 can provide running space for processes in the computing device 1500; for example, the memory 1520 stores the computer program (specifically, the program code) used to generate a process. After the computer program is run by the processor to generate a process, the processor allocates corresponding storage space for the process in the memory 1520. The storage space further includes a text segment, an initialized data segment, an uninitialized data segment, a stack segment, a heap segment, and so on. The memory 1520 stores, in the storage space corresponding to the process, data generated while the process runs, for example, intermediate data or process data.

Optionally, the memory is also referred to as internal memory; it temporarily stores operation data for the processor 1510 as well as data exchanged with external storage such as a hard disk. While the computer is running, the processor 1510 loads the data to be operated on into the memory for computation, and when the computation is complete the processing unit 1512 transfers the result out.

By way of example and not limitation, the memory 1520 is a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory is a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory is a random access memory (RAM), which is used as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DRRAM). It should be noted that the memory 1520 of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.

The structure of the computing device 1500 listed above is only an illustrative example, and the present application is not limited thereto. The computing device 1500 in the embodiments of the present application includes the various hardware found in computer systems in the prior art; for example, the computing device 1500 further includes storage other than the memory 1520, such as disk storage. Those skilled in the art should understand that the computing device 1500 may also include other components necessary for normal operation. Likewise, those skilled in the art should understand that, according to specific needs, the computing device 1500 may also include hardware components that implement other additional functions. In addition, those skilled in the art should understand that the computing device 1500 may include only the components necessary to implement the embodiments of the present application, and need not include all the components shown in FIG. 5.

This embodiment further provides a chip including the above matrix multiplier.

This embodiment further provides a computer program product containing instructions. The computer program product may be software or a program product that contains instructions and that can run on a computing device or be stored in any available medium. When it runs on a computing device, it causes the computing device to perform the method provided above, or causes the computing device to implement the functions of the apparatus provided above.

This embodiment further provides a computer-readable storage medium. The computer-readable storage medium may be any available medium on which a computing device can store data, or a data storage device, such as a data center, that contains one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state drive). The computer-readable storage medium includes instructions which, when executed on a computing device, cause the computing device to perform the method provided above.

It should be understood that, in the various embodiments of the present application, the magnitude of the sequence numbers of the above processes does not imply an order of execution. The execution order of the processes should be determined by their functions and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present application.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.

Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be understood by referring to the corresponding processes in the foregoing method embodiments, and are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.

If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A matrix multiplier, characterized in that the matrix multiplier comprises:

A comparison circuit, configured to determine whether first target data and/or second target data are data in a first set, wherein the first target data is data in a first matrix, the second target data is data in a second matrix, and the first set includes: 0 and ±2ⁿ, where n is an integer;

A first operation circuit, configured to output, according to the first target data and/or the second target data being data in the first set, a first result of multiplying the first target data by the second target data, wherein the first result includes: 0 or third data, and the third data is obtained by shifting the first target data or the second target data according to n, or by shifting and negating the first target data or the second target data according to n.

2. The matrix multiplier according to claim 1, wherein

The first operation circuit is specifically configured to output the first result as 0 according to the first target data being 0.

3. The matrix multiplier according to claim 1, wherein

The first operation circuit is specifically configured to output the first result as the third data according to the first target data being 2ⁿ, wherein the third data is obtained by shifting the second target data left or right by |n| bits.

4. The matrix multiplier according to claim 3, wherein the first operation circuit is specifically configured to:

According to the first target data being 2ⁿ and n being a positive integer, output the first result as the third data, wherein the third data is obtained by shifting the second target data left by n bits; or

According to the first target data being 2ⁿ and n being a negative integer, output the first result as the third data, wherein the third data is obtained by shifting the second target data right by |n| bits.

5. The matrix multiplier according to claim 1, wherein

The first operation circuit is specifically configured to output the first result as the third data according to the first target data being -2ⁿ, wherein the third data is obtained by shifting the second target data left or right by |n| bits and negating the shifted data.

6. The matrix multiplier according to any one of claims 1 to 5, characterized in that,

The comparison circuit is further configured to determine a first operation code according to the first target data and/or the second target data being data in the first set, wherein the value of the first operation code indicates that the first target data and/or the second target data is 0 or ±2ⁿ;

The first operation circuit is specifically configured to determine, according to the value of the first operation code and the first target data and/or the second target data, that the first result is 0 or the third data.

7. The matrix multiplier according to any one of claims 1 to 5, characterized in that the matrix multiplier further includes at least one first register, and the at least one first register is connected to the comparison circuit and to the first operation circuit, respectively,

The at least one first register is configured to obtain the first target data and the second target data from the comparison circuit;

The at least one first register is further configured to output the first target data and the second target data to the first operation circuit.

8. The matrix multiplier according to claim 7, wherein

The at least one first register is further used to obtain the first operation code from the comparison circuit, and output the first operation code to the first operation circuit.

9. The matrix multiplier according to any one of claims 1 to 5, characterized in that the matrix multiplier further includes at least one second register and a second operation circuit, and the at least one second register is connected to the second operation circuit,

The at least one second register is configured to output the obtained first target data and second target data to the second operation circuit when the comparison circuit determines that neither the first target data nor the second target data is data in the first set;

The second operation circuit is configured to perform a conventional multiplication operation on the received first target data and second target data, and to output a second result of multiplying the first target data by the second target data.

10. The matrix multiplier according to claim 9, characterized in that the matrix multiplier further includes a data selector (MUX), and the MUX is connected to the second operation circuit and to the first operation circuit, respectively,

The MUX is configured to use the first result or the second result as an output of the matrix multiplier.

11. The matrix multiplier according to claim 7, wherein

The comparison circuit is further configured to output an enable signal with a value of 1 to the at least one first register when the comparison circuit determines that the first target data and/or the second target data are data in the first set;

The at least one first register is specifically configured to output the first target data and the second target data to the first operation circuit according to the enable signal with a value of 1.

12. The matrix multiplier according to claim 9, wherein the matrix multiplier also includes an inversion circuit, and the inversion circuit is respectively connected with the comparison circuit and the at least one second register,

The comparison circuit is further configured to output an enable signal with a value of 0 to the at least one first register when the comparison circuit determines that neither the first target data nor the second target data is data in the first set;

The inversion circuit is configured to invert the enable signal with a value of 0 output by the comparison circuit to obtain an enable signal with a value of 1, and to output the enable signal with a value of 1 to the at least one second register;

The at least one second register is specifically configured to output the obtained first target data and second target data to the second operation circuit based on the enable signal with a value of 1.

13. The matrix multiplier according to claim 12, wherein the inversion circuit is a NOT gate.

14. The matrix multiplier according to claim 10, wherein the MUX is specifically used for:

After receiving an enable signal with a value of 1, using the first result as an output of the matrix multiplier; or

After receiving the enable signal whose value is 0, the second result is used as the output of the matrix multiplier.

15. A method for matrix multiplication, characterized in that the method is applied to a matrix multiplier, the matrix multiplier is used to perform a matrix multiplication operation on a first matrix and a second matrix, and the method comprises:

A comparison circuit determines whether first target data and/or second target data are data in a first set, wherein the first target data is data in the first matrix, the second target data is data in the second matrix, and the first set includes: 0 and ±2ⁿ, where n is an integer;

A first operation circuit outputs, according to the first target data and/or the second target data being data in the first set, a first result of multiplying the first target data by the second target data, wherein the first result includes: 0 or third data, and the third data is obtained by shifting the first target data or the second target data according to n, or by shifting and negating the first target data or the second target data according to n.

16. The method according to claim 15, wherein the first operation circuit outputting, according to the first target data and/or the second target data being data in the first set, the first result of multiplying the first target data by the second target data comprises:

The first operation circuit outputs the first result as 0 according to the first target data as 0.

17. The method according to claim 15, wherein the first operation circuit outputting, according to the first target data and/or the second target data being data in the first set, the first result of multiplying the first target data by the second target data comprises:

According to the first target data being 2ⁿ, the first operation circuit outputs the first result as the third data, wherein the third data is obtained by shifting the second target data left or right by |n| bits.

18. The method according to claim 17, wherein the first operation circuit outputting the first result as the third data according to the first target data being 2ⁿ comprises:

According to the first target data being 2ⁿ and n being a positive integer, the first operation circuit outputs the first result as the third data, wherein the third data is obtained by shifting the second target data left by n bits; or

According to the first target data being 2ⁿ and n being a negative integer, the first operation circuit outputs the first result as the third data, wherein the third data is obtained by shifting the second target data right by |n| bits.

19. The method according to claim 15, wherein the first operation circuit outputting, according to the first target data and/or the second target data being data in the first set, the first result of multiplying the first target data by the second target data comprises:

According to the first target data being -2ⁿ, the first operation circuit outputs the first result as the third data, wherein the third data is obtained by shifting the second target data left or right by |n| bits and negating the shifted data.

20. The method according to any one of claims 15 to 19, further comprising:

The comparison circuit determines a first operation code according to the first target data and/or the second target data being data in the first set, wherein the value of the first operation code indicates that the first target data and/or the second target data is 0 or ±2ⁿ;

The first operation circuit outputting, according to the first target data and/or the second target data being data in the first set, the first result of multiplying the first target data by the second target data comprises:

The first operation circuit determines, according to the value of the first operation code and the first target data and/or the second target data, that the first result is 0 or the third data.

21. The method according to any one of claims 15 to 19, further comprising:

At least one first register obtains the first target data and the second target data from the comparison circuit, wherein the at least one first register is connected to the comparison circuit and to the first operation circuit, respectively;

The at least one first register outputs the first target data and the second target data to the first operation circuit.

22. The method of claim 21, further comprising:

The at least one first register obtains the first operation code from the comparison circuit and outputs the first operation code to the first operation circuit.

23. The method according to any one of claims 15 to 19, further comprising:

At least one second register outputs the obtained first target data and second target data to a second operation circuit when the comparison circuit determines that neither the first target data nor the second target data is data in the first set, wherein the at least one second register is connected to the second operation circuit;

The second operation circuit performs a conventional multiplication operation on the received first target data and second target data, and outputs a second result of multiplying the first target data by the second target data.

24. The method of claim 23, further comprising:

A data selector (MUX) uses the first result or the second result as the output of the matrix multiplier, wherein the MUX is connected to the second operation circuit and to the first operation circuit, respectively.

25. The method of claim 21, further comprising:

When the comparison circuit determines that the first target data and/or the second target data are data in the first set, the comparison circuit outputs an enable signal with a value of 1 to the at least one first register;

The at least one first register outputting the first target data and the second target data to the first operation circuit comprises:

The at least one first register outputs the first target data and the second target data to the first operation circuit according to the enable signal with a value of 1.

26. The method of claim 23, further comprising:

When the comparison circuit determines that neither the first target data nor the second target data is data in the first set, the comparison circuit outputs an enable signal with a value of 0 to the at least one first register;

An inversion circuit inverts the enable signal with a value of 0 output by the comparison circuit to obtain an enable signal with a value of 1, and outputs the enable signal with a value of 1 to the at least one second register, wherein the inversion circuit is connected to the comparison circuit and to the at least one second register, respectively;

The at least one second register outputting the obtained first target data and second target data to the second operation circuit when the comparison circuit determines that neither the first target data nor the second target data is data in the first set comprises:

The at least one second register outputs the obtained first target data and second target data to the second operation circuit based on the enable signal with a value of 1.

27. The method according to claim 26, wherein the inversion circuit is a NOT gate.

28. The method according to claim 24, wherein the data selector MUX uses the first result or the second result as the output of the matrix multiplier, comprising:

After the MUX receives the enable signal whose value is 1, the first result is used as the output of the matrix multiplier; or

The MUX uses the second result as an output of the matrix multiplier after receiving the enable signal whose value is 0.

29. A chip, characterized in that the chip comprises the matrix multiplier according to any one of claims 1 to 14.

30. A computing device, characterized by comprising:

at least one processor; and

at least one memory coupled to the at least one processor;

Wherein instructions executable by the at least one processor are stored in the at least one memory, and the at least one processor is configured to execute the instructions stored in the at least one memory, so that the computing device performs the method according to any one of claims 15 to 28.

31. A computer program product comprising instructions, characterized in that, when the instructions are executed by a computer, the computer is caused to perform the method according to any one of claims 15 to 28.

32. A computer-readable storage medium, characterized in that it includes computer program instructions, and when the computer program instructions are executed by a computer, the computer is caused to perform the method according to any one of claims 15 to 28.
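
The data path recited in the claims (comparison, shift-based first operation circuit, conventional second operation circuit, and MUX selection by the enable signal) can be illustrated in software. The following is a minimal sketch, not the claimed circuit. It assumes plain Python integers as operands, so an operand classified as ±2ⁿ always has n ≥ 0; the right-shift case for negative n is only reachable by calling `shift_multiply` directly, and it truncates low-order bits. The function names `classify`, `shift_multiply`, and `multiply` are hypothetical.

```python
def classify(a):
    """Comparison step: report whether a is 0, +/-2**n, or neither."""
    if a == 0:
        return "zero", 0, 0
    sign = -1 if a < 0 else 1
    mag = abs(a)
    # A positive integer is a power of two when it has a single set bit.
    if (mag & (mag - 1)) == 0:
        return "pow2", sign, mag.bit_length() - 1
    return "other", 0, 0


def shift_multiply(sign, n, b):
    """First operation circuit: multiply b by sign * 2**n using a shift."""
    shifted = b << n if n >= 0 else b >> -n  # left for n >= 0, right otherwise
    return -shifted if sign < 0 else shifted


def multiply(a, b):
    """Return (result, enable); enable is 1 when the shift path is used."""
    # Check the first operand, then the second, for membership in the set.
    for x, y in ((a, b), (b, a)):
        kind, sign, n = classify(x)
        if kind == "zero":
            return 0, 1
        if kind == "pow2":
            return shift_multiply(sign, n, y), 1
    # Neither operand is in the set: conventional multiplication
    # (second operation circuit), selected when enable is 0.
    return a * b, 0


def matmul(A, B):
    """Matrix product built from the element-wise multiply above."""
    return [[sum(multiply(A[i][k], B[k][j])[0] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]
```

For example, `multiply(8, 5)` takes the shift path and `multiply(3, 5)` falls back to the conventional multiplier; the second element of each returned pair plays the role of the enable signal that the MUX uses to select between the first result and the second result.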