CN118192934A

CN118192934A - Modular multiplication operation method, device, chip, board and vehicle-mounted system

Info

Publication number: CN118192934A
Application number: CN202410395880.0A
Authority: CN
Inventors: 顾瑞红; 施蕾; 王海军; 李晓玮; 王腾飞; 王冬梅
Original assignee: Ecarx Hubei Tech Co Ltd; Shanghai Jiao Tong University
Current assignee: Ecarx Hubei Tech Co Ltd; Shanghai Jiao Tong University
Priority date: 2024-04-02
Filing date: 2024-04-02
Publication date: 2024-06-14

Abstract

The application provides a modular multiplication operation method, device, chip, board and vehicle-mounted system, and relates to the field of computer technology. The modular multiplication operation device comprises a processor and a data operator. The data operator comprises a multiplier, an accumulator, a first register, a second register, a third register, a first multiplexer, a second multiplexer, a third multiplexer and a fourth multiplexer. After receiving a data operation instruction, the processor acquires target data matching the data operation instruction and transmits the target data to the data operator. The data operator receives the target data and performs operation processing on it to obtain a data operation result, which is used to indicate a modular multiplication operation result. The method of the application increases the speed of the modular multiplication operation and improves operation performance.

Description

Modular multiplication operation method, device, chip, board and vehicle-mounted system

Technical Field

The present application relates to the field of computer technology, and in particular to a modular multiplication operation method, device, chip, board and vehicle-mounted system.

Background

Elliptic curve cryptography (ECC) is a public-key cryptographic algorithm based on elliptic curves. Because it offers higher security with shorter keys, the ECC algorithm is widely used in digital signatures, information security, blockchain and other fields.

As the ECC algorithm becomes more widespread, the amount of data it must process is gradually increasing. When the modular multiplication operations of the ECC algorithm are performed, this growing data volume may cause memory overflow or reduce computing efficiency, which in turn degrades the performance of the ECC algorithm.

Summary of the Invention

The present application provides a modular multiplication operation method, device, chip, board and vehicle-mounted system to solve the problems of low efficiency and poor performance in existing modular multiplication operation methods.

In a first aspect, the present application provides a modular multiplication operation device, comprising a processor and a data operator. The data operator comprises a multiplier, an accumulator, a first register, a second register, a third register, a first multiplexer, a second multiplexer, a third multiplexer and a fourth multiplexer. The processor is connected to the multiplier, the first multiplexer, the second multiplexer and the third multiplexer; the first multiplexer is connected to the first register; the first register is connected to the first multiplexer, the second multiplexer and the fourth multiplexer; the second multiplexer is connected to the first register, the second register and the processor; the second register is connected to the accumulator and the second multiplexer; the accumulator is connected to the third multiplexer and the fourth multiplexer; the third multiplexer is connected to the third register; and the third register is connected to the accumulator and the fourth multiplexer.

The processor is configured to, after receiving a data operation instruction, acquire target data matching the data operation instruction and transmit the target data to the data operator.

The data operator is configured to receive the target data and perform operation processing on the target data to obtain a data operation result, wherein the data operation result is used to indicate a modular multiplication operation result.

In a second aspect, the present application provides a modular multiplication operation method applied to a processor, comprising:

in response to a modular multiplication operation instruction, acquiring data to be operated on, wherein the data to be operated on includes data to be calculated and pre-calculated data; the data to be calculated is the data on which the modular multiplication operation is to be performed; the pre-calculated data is the intermediate data relied on during the modular multiplication operation; and the pre-calculated data is determined based on the data to be calculated;

performing, based on preset data operation instructions, a modular multiplication operation on the data to be operated on to obtain the corresponding modular multiplication operation result, wherein the preset data operation instructions are implemented based on the modular multiplication operation device of any one of the implementations of the first aspect.

In a third aspect, the present application provides a chip comprising the modular multiplication operation device of any one of the implementations of the first aspect.

In a fourth aspect, the present application provides a board comprising the chip of the third aspect.

In a fifth aspect, the present application provides a vehicle-mounted system comprising the chip of the third aspect or the board of the fourth aspect.

With the modular multiplication operation method, device, chip, board and vehicle-mounted system provided by the present application, the data to be operated on is acquired in response to a modular multiplication operation instruction, and the operation device described above then performs the modular multiplication operation on that data according to preset data operation instructions to obtain the corresponding modular multiplication operation result. Because the modular multiplication operation is implemented by custom data operation instructions backed by a hardware unit, its processing speed and therefore its efficiency are improved. When this method is applied to a complex algorithm (for example, elliptic curve cryptography), the computing performance of that algorithm improves, so that the algorithm can be applied more smoothly and more widely, and its robustness is improved. Further, the performance of the computer device running the complex algorithm (for example, a vehicle-mounted system) also improves, which in turn improves the experience of users of that device.

Brief Description of the Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the present application.

FIG. 1 is a schematic diagram of a general software architecture of elliptic curve cryptography provided in an embodiment of the present application;

FIG. 2 is a schematic diagram of a modular multiplication operation device provided in an embodiment of the present application;

FIG. 3 is a schematic diagram of preset operation instructions provided in an embodiment of the present application;

FIG. 4 is a schematic diagram of the operation device corresponding to a first multiplication extension instruction provided in an embodiment of the present application;

FIG. 5 is a schematic diagram of the operation device corresponding to a second multiplication extension instruction provided in an embodiment of the present application;

FIG. 6 is a schematic diagram of the operation device corresponding to a third multiplication extension instruction provided in an embodiment of the present application;

FIG. 7 is a schematic diagram of the operation device corresponding to a fourth multiplication extension instruction provided in an embodiment of the present application;

FIG. 8 is a schematic diagram of the operation device corresponding to a first zeroing extension instruction provided in an embodiment of the present application;

FIG. 9 is a schematic diagram of the operation device corresponding to a second zeroing extension instruction provided in an embodiment of the present application;

FIG. 10 is a schematic diagram of the operation device corresponding to a data keeping extension instruction provided in an embodiment of the present application;

FIG. 11 is a schematic flow chart of a modular multiplication operation method provided by the present application;

FIG. 12 is a schematic flow chart of another modular multiplication operation method provided by the present application;

FIG. 13 is a schematic diagram of a first multiplication instruction corresponding to n-word-length data provided in an embodiment of the present application;

FIG. 14 is a schematic diagram of a second multiplication instruction corresponding to n-word-length data provided in an embodiment of the present application;

FIG. 15 is a schematic flow chart of a multiplication operation on n-word-length data provided in an embodiment of the present application;

FIG. 16 is a schematic flow chart of a multiplication operation on single-word-length data provided in an embodiment of the present application;

FIG. 17 is a schematic diagram of the operation flow of a Montgomery modular multiplication operation provided in an embodiment of the present application;

FIG. 18 is a schematic diagram of a modular multiplication performance evaluation result provided in an embodiment of the present application;

FIG. 19 is a schematic diagram of a chip provided in an embodiment of the present application;

FIG. 20 is a schematic diagram of a board provided in an embodiment of the present application;

FIG. 21 is a schematic diagram of a vehicle-mounted system provided in an embodiment of the present application.

The above drawings show specific embodiments of the present application, which are described in more detail below. These drawings and the accompanying text are not intended to limit the scope of the concept of the present application in any way, but to explain the concept of the present application to those skilled in the art by reference to specific embodiments.

Detailed Description

Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of devices and methods consistent with some aspects of the present application as detailed in the appended claims.

It should be noted that the user information (including but not limited to user device information and user personal information) and data (including but not limited to data used for analysis, stored data and displayed data) involved in the present application are all information and data authorized by the user or fully authorized by all parties. The collection, use and processing of such data must comply with the relevant laws, regulations and standards of the relevant countries and regions, and a corresponding operation entry is provided for users to grant or refuse authorization.

At present, as the ECC algorithm becomes more widespread, the amount of data it must process is gradually increasing.

In one example, FIG. 1 is a schematic diagram of a general software architecture of elliptic curve cryptography provided in an embodiment of the present application. As shown in FIG. 1, elliptic curve operations mainly include "fixed point scalar multiplication", "free point scalar multiplication", "point addition" and "point doubling". These elliptic curve operations are implemented mainly on top of prime field operations, including modular addition, modular subtraction, modular multiplication and modular inversion. An elliptic curve cryptographic algorithm can then be implemented by combining a random number generator and a hash function with the elliptic curve operations, and an elliptic curve cryptographic algorithm interface can be provided to expose encryption and decryption based on elliptic curve cryptography.

At present, when modular multiplication over a prime field is performed as part of an elliptic curve cryptographic algorithm, the gradually increasing amount of data may cause memory overflow or reduce computing efficiency, which in turn degrades the performance of the ECC algorithm.

The modular multiplication operation method provided in the present application aims to solve the above technical problems of the prior art by designing a hardware operation unit for the modular multiplication operation and combining it with RISC-V extension instructions.

The technical solution of the present application, and how it solves the above technical problems, are described in detail below with specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present application are described below with reference to the accompanying drawings.

In one example, the modular multiplication operation device provided in the embodiments of the present application is described with a single word length of 32 bits for the data processed by the multiplier, the accumulator and the multiplexers. Because the first register and the second register store the multiplication result produced by the multiplier, each of them holds data with a word length of 64 bits. The third register stores the summation result of the accumulator, and therefore holds data with a word length of 67 bits.

It should be noted that, besides 32 bits, the single word length handled by the hardware processing unit may also be 64 bits, 16 bits, 128 bits and so on. The single word length is not limited here, provided that it can be implemented.
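The register widths above can be checked with a little arithmetic. The following is a minimal sketch, not taken from the source: the function names are hypothetical, and the reading that the three extra accumulator bits provide headroom for summing several products is an inference from the stated 64-bit and 67-bit widths, not something the source states.

```python
def register_widths(word_bits=32, max_products=8):
    """Return (product register bits, accumulator bits) for a given word size.

    max_products is an assumed number of full-size products summed before
    the accumulator is read out; the source does not state this number.
    """
    product_bits = 2 * word_bits                  # a w-bit by w-bit product fits in 2w bits
    max_product = (2**word_bits - 1) ** 2         # largest possible single product
    max_sum = max_products * max_product          # largest possible accumulated sum
    return product_bits, max_sum.bit_length()


def max_products_without_overflow(word_bits=32, acc_bits=67):
    """How many worst-case products an acc_bits-wide accumulator can sum."""
    max_product = (2**word_bits - 1) ** 2
    return (2**acc_bits - 1) // max_product


print(register_widths(32, 8))                 # (64, 67)
print(max_products_without_overflow(32, 67))  # 8
```

Under these assumptions, a 67-bit accumulator can absorb eight worst-case 32-bit by 32-bit products without overflow, which is consistent with the 64-bit and 67-bit widths given in the example.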

FIG. 2 is a schematic diagram of a modular multiplication operation device provided in an embodiment of the present application. As shown in FIG. 2, the operation device includes a processor 21 and a data operator 22, wherein the data operator 22 includes a multiplier 201, an accumulator 202, a first register 203, a second register 204, a third register 205, a first multiplexer 206, a second multiplexer 207, a third multiplexer 208 and a fourth multiplexer 209.

The processor 21 is connected to the multiplier 201, the first multiplexer 206, the second multiplexer 207 and the third multiplexer 208; the first multiplexer 206 is connected to the first register 203; the first register 203 is connected to the first multiplexer 206, the second multiplexer 207 and the fourth multiplexer 209; the second multiplexer 207 is connected to the first register 203, the second register 204 and the processor 21; the second register 204 is connected to the accumulator 202 and the second multiplexer 207; the accumulator 202 is connected to the third multiplexer 208 and the fourth multiplexer 209; the third multiplexer 208 is connected to the third register 205; and the third register 205 is connected to the accumulator 202 and the fourth multiplexer 209.

The processor 21 is configured to, after receiving a data operation instruction, acquire target data matching the data operation instruction and transmit the target data to the data operator 22.

The data operator 22 is configured to receive the target data and perform operation processing on the target data to obtain a data operation result, wherein the data operation result is used to perform the modular multiplication operation.

In one example, the data operation instruction is an instruction that indicates how the target data is to be processed. It may indicate that a multiplication operation is to be performed on the target data, or that a multiplication result is to be output. Accordingly, the target data may represent a multiplier and a multiplicand, or a multiplication result.

In one example, the single word length of the data handled by a multiplexer is 32 bits. When data is transmitted through a multiplexer and its word length exceeds 32 bits, the transfer can be completed either by using several multiplexers simultaneously, according to the word length of the data, or by reusing a single multiplexer repeatedly.

For example, the first multiplexer 206 shown in FIG. 2 selects one of the following: the data stored in the first register 203, the value 0, or the multiplication result output by the multiplier 201. When the data stored in the first register 203 or the multiplication result output by the multiplier 201 is transmitted, the transfer can be achieved by reusing a single multiplexer, namely the first multiplexer 206.

As another example, for the second multiplexer 207 and the third multiplexer 208 shown in FIG. 2, when the word length of the data to be transmitted exceeds 32 bits, the transfer is achieved by using several multiplexers simultaneously. The number of multiplexers used for data transmission is not limited here and depends on actual needs.
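To make the word-splitting idea concrete, the sketch below models moving a value wider than one word through 32-bit lanes, either as parallel lanes or as repeated passes through one lane. It is only an illustration: the helper names `split_words` and `join_words` are hypothetical, and the little-endian word order is an assumption that the source does not specify.

```python
def split_words(value, total_bits, word_bits=32):
    """Split a total_bits-wide value into word_bits-wide words, low word first."""
    n_words = -(-total_bits // word_bits)          # ceiling division
    mask = (1 << word_bits) - 1
    return [(value >> (i * word_bits)) & mask for i in range(n_words)]


def join_words(words, word_bits=32):
    """Reassemble words (low word first) into a single integer."""
    result = 0
    for i, word in enumerate(words):
        result |= word << (i * word_bits)
    return result


product = 0x1122334455667788                       # a 64-bit multiplication result
lanes = split_words(product, 64)                   # two 32-bit lanes (or two passes)
print([hex(w) for w in lanes])                     # ['0x55667788', '0x11223344']
print(len(split_words((1 << 67) - 1, 67)))         # 3 lanes needed for a 67-bit sum
```

In this model a 64-bit product needs two 32-bit transfers and a 67-bit accumulator value needs three, whether those transfers happen in parallel or one after another.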

In one example, when the operation device shown in FIG. 2 is used to implement the modular multiplication operation, the corresponding data operation instructions can be defined first, and operation processing is then performed according to those instructions to obtain the modular multiplication operation result.

In one example, FIG. 3 is a schematic diagram of preset operation instructions provided in an embodiment of the present application. As shown in FIG. 3, the preset operation instructions include the following. The first multiplication extension instruction, the multadd instruction in FIG. 3, written "multadd rd, rs1, rs2", performs a multiplication and accumulates the result. The second multiplication extension instruction, the multaddh instruction in FIG. 3, written "multaddh rd, rs1, rs2", performs a multiplication and accumulation, then retains the high-order part of the result and outputs the low-order part. The third multiplication extension instruction, the mul instruction in FIG. 3, written "mul rd, rs1, rs2", performs a multiplication and outputs the low-order part of the multiplication result. The fourth multiplication extension instruction, the mulh instruction in FIG. 3, written "mulh rd, rs1, rs2", performs a multiplication and outputs the high-order part of the multiplication result. The first zeroing extension instruction, the rdlset0 instruction in FIG. 3, written "rdlset0 rd", outputs the low-order data and then sets the data to zero. The second zeroing extension instruction, the rdhset0 instruction in FIG. 3, written "rdhset0 rd", outputs the high-order data and then sets the data to zero. The data keeping extension instruction, the rdlkeep instruction in FIG. 3, written "rdlkeep rd", outputs the low-order data and retains the remaining data. Here, rs1 and rs2 denote the input data of an instruction, and rd denotes its output data.

In one example, the preset operation instructions described above can be designed on the basis of RISC-V as custom RISC-V extension instructions. As shown in FIG. 3, the values of the funct7 and funct3 fields of each custom RISC-V extension instruction can be defined so that the corresponding operation instruction can be invoked. For example, for the multadd instruction, funct7 may be "0011100" and funct3 may be "000"; for the multaddh instruction, funct7 may be "0011101" and funct3 may be "000"; for the mul instruction, funct7 may be "0000001" and funct3 may be "000"; for the mulh instruction, funct7 may be "0000001" and funct3 may be "001"; for the rdlset0 instruction, funct7 may be "0011110" and funct3 may be "000"; for the rdhset0 instruction, funct7 may be "0011111" and funct3 may be "000"; and for the rdlkeep instruction, funct7 may be "0100000" and funct3 may be "000".

It should be noted that the values of the funct7 and funct3 fields in the embodiments of the present application can be customized. These values are not limited here, provided that each operation instruction can be distinguished unambiguously.
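As an illustration of how these field values would appear in a 32-bit instruction word, the sketch below packs them into the standard RISC-V R-type layout (funct7, rs2, rs1, funct3, rd, opcode). Several points are assumptions rather than statements from the source: the source does not give the opcode, so the standard OP opcode 0110011 is used as a default (under that opcode the mul and mulh values coincide with the standard RV32M MUL and MULH encodings); the unused rs1 and rs2 fields of the single-operand instructions are set to zero; and the names `FUNCT`, `OPCODE_OP` and `encode_r_type` are hypothetical.

```python
# funct7 / funct3 values as listed in the example above (FIG. 3).
FUNCT = {
    "multadd":  (0b0011100, 0b000),
    "multaddh": (0b0011101, 0b000),
    "mul":      (0b0000001, 0b000),
    "mulh":     (0b0000001, 0b001),
    "rdlset0":  (0b0011110, 0b000),
    "rdhset0":  (0b0011111, 0b000),
    "rdlkeep":  (0b0100000, 0b000),
}

OPCODE_OP = 0b0110011  # ASSUMPTION: the source does not specify the opcode


def encode_r_type(name, rd, rs1=0, rs2=0, opcode=OPCODE_OP):
    """Pack an instruction into the RISC-V R-type layout.

    Bit layout: funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0]
    """
    funct7, funct3 = FUNCT[name]
    for reg in (rd, rs1, rs2):
        if not 0 <= reg < 32:
            raise ValueError("register index must be in 0..31")
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


print(hex(encode_r_type("mul", rd=5, rs1=6, rs2=7)))      # 0x27302b3
print(hex(encode_r_type("multadd", rd=5, rs1=6, rs2=7)))  # 0x387302b3
```

Because every instruction in the table has a distinct (funct7, funct3) pair, the seven encodings remain distinguishable for any fixed choice of opcode and registers, which is the only requirement the text places on these values.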

The operation process of each operation instruction shown in FIG. 3 is described below with reference to schematic diagrams of the operation device.

In one example, the data operation instruction is the first multiplication extension instruction, which is an instruction that performs a multiplication and accumulates the result. The target data includes a first multiplier and a first multiplicand.

In this case, refer to FIG. 4, which is a schematic diagram of the operation device corresponding to the first multiplication extension instruction provided in an embodiment of the present application. As shown by the solid lines in FIG. 4:

The processor 41 is configured to, after receiving the first multiplication extension instruction, transmit the first multiplier (rs1 in FIG. 4) and the first multiplicand (rs2 in FIG. 4) to the multiplier 401, so that the multiplier 401 multiplies them to obtain a first multiplication result, and to transmit the first multiplication result through the first multiplexer 406 and save it in the first register 403.

The first register 403 is configured to receive the first multiplication result and transmit it through the second multiplexer 407 to be saved in the second register 404.

The second register 404 is configured to obtain the data stored in the third register 405, and to transmit the data stored in the third register 405 together with the received first multiplication result to the accumulator 402.

The accumulator 402 is configured to add the data stored in the third register 405 and the first multiplication result to obtain a first addition result (mult_add_data in FIG. 4), and to save the first addition result through the third multiplexer 408 in the third register 405.

At this point, the data stored in the third register 405 is the result rd obtained by processing rs1 and rs2 with the first multiplication extension instruction, multadd.
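The multadd data path described above can be summarized as a small behavioral model. This is a minimal sketch under stated assumptions, not the hardware design itself: the class name `DataOperator` and the attribute names are hypothetical, the widths follow the 32-bit example (64-bit first and second registers, 67-bit third register), and the multiplexers are modelled implicitly as plain assignments.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1
MASK67 = (1 << 67) - 1


class DataOperator:
    """Behavioral model of the multadd path (multiplier, three registers, accumulator)."""

    def __init__(self):
        self.reg1 = 0  # first register: holds the product from the multiplier
        self.reg2 = 0  # second register: feeds the accumulator
        self.reg3 = 0  # third register: holds the running sum

    def multadd(self, rs1, rs2):
        product = (rs1 & MASK32) * (rs2 & MASK32)        # multiplier
        self.reg1 = product & MASK64                     # via the first multiplexer
        self.reg2 = self.reg1                            # via the second multiplexer
        self.reg3 = (self.reg3 + self.reg2) & MASK67     # accumulator, via the third multiplexer
        return self.reg3                                 # mult_add_data


op = DataOperator()
op.multadd(3, 4)         # third register now holds 12
print(op.multadd(5, 6))  # 42, i.e. 12 + 5 * 6
```

Each call adds one product to whatever the third register already holds, which is what lets a sequence of multadd instructions sum several partial products before any result is read out.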

In one example, the data operation instruction represents a second multiplication extension instruction, that is, an instruction that performs a multiplication and a summation, retains the high-order part of the result, and outputs the low-order part. The target data includes a second multiplier and a second multiplicand.

In this case, refer to FIG. 5, which is a schematic diagram of the operation device corresponding to the second multiplication extension instruction provided in an embodiment of the present application. As shown by the solid lines in FIG. 5:

The processor 51 is configured to, after receiving the second multiplication extension instruction, transmit the second multiplier (rs1 in FIG. 5) and the second multiplicand (rs2 in FIG. 5) to the multiplier 501, so that the multiplier 501 multiplies them to obtain a second multiplication result; the second multiplication result is transmitted through the first multiplexer 506 and saved in the first register 503.

The first register 503 is configured to receive the second multiplication result and to transmit it through the second multiplexer 507 to the second register 504, where it is saved.

The second register 504 is configured to transmit the second multiplication result to the accumulator 502.

The accumulator 502 is configured to obtain the data stored in the third register 505 and add it to the received second multiplication result to obtain a second addition result (mult_add_data in FIG. 5).

The processor 51 is further configured to obtain and output, through the fourth multiplexer 509, the low-order data of the second addition result (mult_add_data in FIG. 5), and to save, through the third multiplexer 508, the high-order data of the second addition result to the third register 505.

At this point, the low-order data of the second addition result that is output is the result rd obtained by processing rs1 and rs2 with the second multiplication extension instruction, multaddh.
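The behaviour of the multadd and multaddh instructions described above can be sketched in software. This is a minimal behavioural model, not the hardware itself: it assumes unsigned 32-bit operands, treats "low-order data" as the low 32 bits and "high-order data" as everything above them, represents the third register as an unbounded Python integer `acc`, and ignores the pipelining through the first and second registers. The function signatures are illustrative.

```python
WORD_BITS = 32                       # single word length used in the description
WORD_MASK = (1 << WORD_BITS) - 1


def multadd(acc, rs1, rs2):
    """Model of multadd: add rs1 * rs2 to the running sum and keep all of it.

    Returns the new content of the third register (the value the text calls rd).
    """
    return acc + rs1 * rs2


def multaddh(acc, rs1, rs2):
    """Model of multaddh: add rs1 * rs2, output the low word, keep the high part.

    Returns (rd, new_acc): rd is the low 32 bits of the sum, new_acc is the
    remaining high-order part that stays in the third register.
    """
    total = acc + rs1 * rs2
    return total & WORD_MASK, total >> WORD_BITS
```

Chaining the two calls mimics one accumulation step followed by one "emit a word and carry the rest forward" step.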

In one example, the data operation instruction represents a third multiplication extension instruction, that is, an instruction that performs a multiplication and outputs the low-order part of the multiplication result. The target data includes a third multiplier and a third multiplicand. In this case, refer to FIG. 6, which is a schematic diagram of the operation device corresponding to the third multiplication extension instruction provided in an embodiment of the present application. As shown by the solid lines in FIG. 6:

The processor 61 is configured to transmit the third multiplier (rs1 in FIG. 6) and the third multiplicand (rs2 in FIG. 6) to the multiplier 601 after receiving the third multiplication extension instruction.

The multiplier 601 is configured to receive the third multiplier (rs1 in FIG. 6) and the third multiplicand (rs2 in FIG. 6), multiply them to obtain a third multiplication result, and transmit the third multiplication result through the first multiplexer 606 to the first register 603, where it is saved.

The processor 61 is further configured to select and output, through the fourth multiplexer 609, the low-order data of the third multiplication result.

At this point, the low-order data of the third multiplication result that is output is the result rd obtained by processing rs1 and rs2 with the third multiplication extension instruction, mul.

In one example, the data operation instruction represents a fourth multiplication extension instruction, that is, an instruction that performs a multiplication and outputs the high-order part of the multiplication result. The target data includes a fourth multiplier, a fourth multiplicand, and a first target value. In this case, refer to FIG. 7, which is a schematic diagram of the operation device corresponding to the fourth multiplication extension instruction provided in an embodiment of the present application. As shown by the solid lines in FIG. 7:

The processor 71 is configured to transmit the fourth multiplier (rs1 in FIG. 7) and the fourth multiplicand (rs2 in FIG. 7) to the multiplier 701 after receiving the fourth multiplication extension instruction.

The multiplier 701 is configured to receive the fourth multiplier (rs1 in FIG. 7) and the fourth multiplicand (rs2 in FIG. 7), multiply them to obtain a fourth multiplication result, and transmit the fourth multiplication result through the first multiplexer 706 to the first register 703, where it is saved.

The first register 703 is configured to transmit the fourth multiplication result through the second multiplexer 707 to the second register 704.

The processor 71 is further configured to output, through the fourth multiplexer 709, the high-order data of the fourth multiplication result stored in the second register 704; to transmit the first target value (the value 0 in FIG. 7) through the second multiplexer 707 to the second register 704 so as to set the second register 704 to zero; and to transmit the first target value through the third multiplexer 708 to the third register 705 so as to set the third register 705 to zero.

At this point, the high-order data of the fourth multiplication result that is output is the result rd obtained by processing rs1 and rs2 with the fourth multiplication extension instruction, mulh.

Setting the third register 705 and the second register 704 to zero prepares the device for the next operation.
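The mul and mulh instructions described above split one product into its two halves. The sketch below is a behavioural model only, under the assumption that both operands are unsigned 32-bit values (the description does not state signedness); the tuple returned by `mulh` is an illustrative way to show that the second and third registers end up holding the first target value 0.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


def mul(rs1, rs2):
    """Model of mul: return the low 32 bits of rs1 * rs2."""
    return (rs1 * rs2) & WORD_MASK


def mulh(rs1, rs2):
    """Model of mulh: return the high 32 bits of rs1 * rs2, then clear state.

    Returns (rd, reg2, reg3); reg2 and reg3 are both 0 afterwards, matching
    the zeroing of the second and third registers in the description.
    """
    product = rs1 * rs2
    rd = (product >> WORD_BITS) & WORD_MASK
    return rd, 0, 0
```

Under these assumptions, `(mulh(a, b)[0] << 32) | mul(a, b)` reproduces the full 64-bit product of two single words.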

In one example, the data operation instruction represents a first zeroing extension instruction, that is, an instruction that outputs the low-order data and then performs zeroing. The target data includes the summation result produced by the accumulator and the first target value. In this case, refer to FIG. 8, which is a schematic diagram of the operation device corresponding to the first zeroing extension instruction provided in an embodiment of the present application. As shown by the solid lines in FIG. 8:

The processor 81 is configured to select and output, through the fourth multiplexer 809, the low-order data of the summation result produced by the accumulator (mult_add_data in FIG. 8); to transmit the first target value (the value 0 in FIG. 8) through the second multiplexer 807 to the second register 804 so as to set the second register 804 to zero; and to transmit the first target value through the third multiplexer 808 to the third register 805 so as to set the third register 805 to zero.

In one example, the data operation instruction represents a second zeroing extension instruction, that is, an instruction that outputs the high-order data and then performs zeroing. The target data includes the data stored in the third register and the first target value. In this case, refer to FIG. 9, which is a schematic diagram of the operation device corresponding to the second zeroing extension instruction provided in an embodiment of the present application. As shown by the solid lines in FIG. 9:

The processor 91 is configured to output, through the fourth multiplexer 909, the high-order data of the data stored in the third register 905; to transmit the first target value (the value 0 in FIG. 9) through the second multiplexer 907 to the second register 904 so as to set the second register 904 to zero; and to transmit the first target value through the third multiplexer 908 to the third register 905 so as to set the third register 905 to zero.

In one example, the data operation instruction represents a data retention extension instruction, that is, an instruction that outputs the low-order data and retains the remaining data. The target data includes the data stored in the second register and the data stored in the third register. In this case, refer to FIG. 10, which is a schematic diagram of the operation device corresponding to the data retention extension instruction provided in an embodiment of the present application. As shown by the solid lines in FIG. 10:

The accumulator 1002 is configured to receive the data stored in the second register 1004 and the third register 1005, and to add the data stored in the second register 1004 to the data stored in the third register 1005 to obtain a third addition result.

The processor 101 is configured to, after the accumulator 1002 obtains the third addition result, output the low-order data of the third addition result through the fourth multiplexer 1009, and transmit the high-order data of the third addition result through the third multiplexer 1008 to the third register 1005, where it is saved.
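The data retention instruction and the first zeroing instruction both read the accumulator sum but treat the registers differently afterwards. The following sketch models that difference under several assumptions: the class `Datapath` and the method names `hold` and `clear_low` are hypothetical, the summation result is taken to be the second register plus the third register, words are 32 bits, and the second register is left untouched by `hold` because the description does not say what happens to it. The second zeroing instruction is not modelled, since this passage does not pin down which bits of the third register count as its high-order data.

```python
from dataclasses import dataclass

WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


@dataclass
class Datapath:
    reg2: int = 0   # second register: pending product (assumed role)
    reg3: int = 0   # third register: running sum (assumed role)

    def hold(self):
        """Data retention: output the low word of reg2 + reg3, keep the rest in reg3."""
        total = self.reg2 + self.reg3
        self.reg3 = total >> WORD_BITS      # high-order part is retained
        return total & WORD_MASK            # low-order part is output

    def clear_low(self):
        """First zeroing: output the low word of reg2 + reg3, then zero both registers."""
        total = self.reg2 + self.reg3
        self.reg2 = 0
        self.reg3 = 0
        return total & WORD_MASK
```

For instance, with `reg2 = 0x100000005` and `reg3 = 0xFFFFFFFF`, `hold()` outputs 4 and leaves 2 in the third register, whereas `clear_low()` on the same state would output 4 and leave both registers at 0.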

Based on the data operation instructions that the above modular multiplication device can implement, the present application also provides a modular multiplication method. Refer to FIG. 11, which is a flow chart of a modular multiplication method provided by the present application. As shown in FIG. 11, the method can be applied to the processor in any of the above embodiments and includes:

S1101. In response to a modular multiplication instruction, obtain the data to be operated on.

The data to be operated on includes data to be calculated and precomputed data. The data to be calculated represents the data on which the modular multiplication is to be performed; the precomputed data represents the intermediate data on which the modular multiplication relies, and is determined from the data to be calculated.

In one example, the data to be calculated can be understood as the data on which the modular multiplication is to be performed; for example, it may include a multiplier x, a multiplicand y, and a modulus m.

In one example, the precomputed data may include modular exponentiation data R, first intermediate data w, and second intermediate data nm. The precomputed data can be determined from the data to be calculated: R = 2^N, where 2^(N-1) < m ≤ 2^N; w = -m^(-1) mod R; and nm = ~m mod R.
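The three precomputed values can be derived from the modulus alone. The sketch below assumes that m is odd and greater than 1 (so that m has an inverse modulo a power of two; the description does not state this condition), chooses N as the bit length of m, and reads ~ as bitwise complement. The function name `precompute` is illustrative, and `pow(m, -1, R)` requires Python 3.8 or later.

```python
def precompute(m):
    """Return (R, w, nm) for an odd modulus m > 1.

    R  = 2**N with 2**(N-1) < m <= 2**N
    w  = -m**(-1) mod R
    nm = ~m mod R   (bitwise complement of m, reduced modulo R)
    """
    if m <= 1 or m % 2 == 0:
        raise ValueError("this sketch assumes an odd modulus greater than 1")
    N = m.bit_length()            # for odd m > 1: 2**(N-1) < m < 2**N
    R = 1 << N
    w = (-pow(m, -1, R)) % R      # modular inverse via built-in pow (Python 3.8+)
    nm = ~m % R                   # equals R - 1 - m because m < R
    return R, w, nm
```

A quick sanity check on the output is that `m * w + 1` is divisible by R and that `m + nm` equals `R - 1`.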

S1102. Based on the preset data operation instructions, perform a modular multiplication on the data to be operated on to obtain the corresponding modular multiplication result.

The preset data operation instructions are implemented by any one of the modular multiplication devices described above.
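The constants R and w = -m^(-1) mod R have the same form as the constants used in Montgomery reduction. This passage does not spell out the reduction steps, so the following is only a sketch of how a modular multiplication could use such constants, assuming Montgomery reduction is the intended technique, m is odd, and 0 ≤ x, y < m. The function names are hypothetical, and the sketch works on whole Python integers rather than on 32-bit words.

```python
def montgomery_reduce(t, m, R, w):
    """Return t * R**(-1) mod m, for 0 <= t < m * R."""
    q = ((t % R) * w) % R          # makes t + q*m divisible by R
    u = (t + q * m) // R           # exact division, result lies in [0, 2m)
    return u - m if u >= m else u  # one conditional subtraction


def modmul(x, y, m):
    """Return x * y mod m using Montgomery reduction (m odd, 0 <= x, y < m)."""
    N = m.bit_length()
    R = 1 << N
    w = (-pow(m, -1, R)) % R                         # Python 3.8+
    r2 = (R * R) % m                                 # helper constant, not in the source
    x_bar = montgomery_reduce(x * r2, m, R, w)       # x * R mod m
    return montgomery_reduce(x_bar * y, m, R, w)     # (x*R) * y * R**(-1) = x*y mod m
```

The multiplications inside `montgomery_reduce` are the multi-word products that the custom multiplication instructions are meant to accelerate; here Python's built-in integers stand in for them.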

In one example, the modular multiplication method provided in the embodiments of the present application can be used in computing complex algorithms, for example elliptic curve cryptography. The complex algorithms to which the method can be applied are not limited here, provided the method can be implemented for them.

As can be seen from the above description, in the embodiments of the present application the data to be operated on is obtained in response to a modular multiplication instruction, and the above operation device then performs the modular multiplication on that data according to the preset data operation instructions, yielding the corresponding modular multiplication result. Because the modular multiplication is implemented by custom data operation instructions executed in hardware units, its processing speed, and therefore its efficiency, is improved. When the method is applied to a complex algorithm (for example, elliptic curve cryptography), the computing performance of that algorithm is improved, so that the algorithm can be applied more smoothly and more widely, and its robustness is improved. Furthermore, the performance of the computer device that runs the algorithm (for example, a vehicle-mounted system) is improved, which in turn improves the experience of its users.

In one example, refer to FIG. 12, which is a flow chart of another modular multiplication method provided by the present application. As shown in FIG. 12, the method includes:

S1201. In response to a modular multiplication instruction, obtain the data to be operated on.

In one example, this step is the same as S1101 described above and is not repeated here.

In one example, after the data to be operated on is obtained and before the modular multiplication is performed on it based on the preset data operation instructions, the embodiment of the present application may first determine, based on the preset data operation instructions, a first multiplication instruction and a second multiplication instruction for implementing the modular multiplication. The first multiplication instruction determines the high-order data of a multiplication result through a first multiplication operation; the second multiplication instruction determines the low-order data of a multiplication result through a second multiplication operation. See the steps described below for details.

S1202. Determine the first multiplication instruction and the second multiplication instruction based on the data word length of the data to be operated on and the preset data operation instructions.

The first multiplication instruction is used to implement the first multiplication operation; the second multiplication instruction is used to implement the second multiplication operation.

In one example, the data word length of the data to be operated on may be greater than or equal to the single word length (that is, 32 bits) that the modular multiplication device can process.

When the data word length of the data to be operated on is a multi-word length, the preset data operation instructions include the first multiplication extension instruction, the second multiplication extension instruction, the first zeroing extension instruction, the second zeroing extension instruction, and the data retention extension instruction. In this case, determining the first multiplication instruction and the second multiplication instruction based on the data word length of the data to be operated on and the preset data operation instructions specifically includes the following process:

First, determine the target word length corresponding to the multiplicand and the multiplier in the data to be operated on.

The target word length represents the data word length of the multiplicand or the multiplier, and can be denoted n, where n is a positive integer greater than 1.

In one example, the target word length can be understood as the larger of the first word length of the multiplicand and the second word length of the multiplier. For example, if the first word length of the multiplicand is 5 and the second word length of the multiplier is 3, the target word length is 5. As another example, if the first word length of the multiplicand is 5 and the second word length of the multiplier is 5, the target word length is 5.
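The rule for the target word length can be written out directly. The sketch below assumes 32-bit words and non-negative integer operands; the helper names `word_length` and `target_word_length` are illustrative.

```python
WORD_BITS = 32


def word_length(value):
    """Number of 32-bit words needed to hold a non-negative integer (at least 1)."""
    return max(1, -(-value.bit_length() // WORD_BITS))   # ceiling division


def target_word_length(multiplicand, multiplier):
    """Target word length n: the larger of the two operand word lengths."""
    return max(word_length(multiplicand), word_length(multiplier))
```

With a 160-bit multiplicand (5 words) and a 96-bit multiplier (3 words), the function returns 5, matching the first example in the text.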

Then, determine the first multiplication instruction and the second multiplication instruction based on the target word length, the first multiplication extension instruction, the second multiplication extension instruction, the first zeroing extension instruction, the second zeroing extension instruction, and the data retention extension instruction.

In one example, determining the first multiplication instruction based on the target word length, the first multiplication extension instruction, the second multiplication extension instruction, the first zeroing extension instruction, the second zeroing extension instruction, and the data retention extension instruction specifically includes the following steps:

First, based on the data length of a single word and the target word length, split the multiplicand and the multiplier to obtain a plurality of first sub-data corresponding to the multiplicand and a plurality of second sub-data corresponding to the multiplier.

In one example, assuming the data length of a single word is 32 bits, the multiplicand and the multiplier can each be split in units of 32 bits, starting from the least significant bit, to obtain the plurality of first sub-data corresponding to the multiplicand and the plurality of second sub-data corresponding to the multiplier.
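The splitting step can be sketched as follows, assuming 32-bit words and an operand that fits in n words. The function name `split_words` is illustrative, and the returned list is ordered from the least significant word upward, which is one possible ordering rather than the numbering used in the figures.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


def split_words(value, n):
    """Split a non-negative integer into n 32-bit words, least significant first."""
    if value >> (WORD_BITS * n):
        raise ValueError("value does not fit in n words")
    return [(value >> (WORD_BITS * k)) & WORD_MASK for k in range(n)]
```

Applying `split_words` to both the multiplicand and the multiplier with the same n pads the shorter operand with zero words at the top.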

Then, determine the first data combination patterns that match the first multiplication instruction, and determine a plurality of first multiplication data groups based on the matching first data combination patterns.

A first multiplication data group indicates a first sub-data and a second sub-data that need to be multiplied together.

In one example, a first data combination pattern indicates how first sub-data and second sub-data are paired for processing. For example, let the first sub-data be denoted X_i, where i identifies the i-th first sub-data of the multiplicand, and let the second sub-data be denoted Y_j, where j identifies the j-th second sub-data of the multiplier. A first data combination pattern may then be i+j = 2n: the i-th first sub-data and the j-th second sub-data whose indices sum to 2n are paired for processing, and a first multiplication data group is determined from each such pair.

In one example, there may be multiple first data combination patterns matching the first multiplication instruction. For example, the value of i+j given by the first data combination patterns may be 2n, 2n-1, ..., n, ..., 3, 2.

The number of first multiplication data groups determined from a first data combination pattern may be one or more. For example, when i+j is 2n, both i and j equal n, so there is 1 first multiplication data group. When i+j is 3, either i is 1 and j is 2, or i is 2 and j is 1, so there are 2 first multiplication data groups.
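The patterns and the groups they produce can be enumerated mechanically. The sketch below assumes indices i and j both run from 1 to n, as in the text; the function name `multiplication_groups` is illustrative.

```python
def multiplication_groups(n):
    """Map each pattern value s = i + j (from 2n down to 2) to its (i, j) pairs."""
    return {
        s: [(i, s - i) for i in range(1, n + 1) if 1 <= s - i <= n]
        for s in range(2 * n, 1, -1)
    }
```

For n = 2 this gives one group for i+j = 4, two groups for i+j = 3, and one group for i+j = 2, and for any n the group counts sum to n * n, one per pair of words.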

Next, based on the number of first data combination patterns and the number of first multiplication data groups determined from each first data combination pattern, determine a first number of first multiplication extension instructions and a second number of second multiplication extension instructions included in the first multiplication instruction.

Finally, determine the first multiplication instruction based on the first number of first multiplication extension instructions, the second number of second multiplication extension instructions, the data retention extension instruction, and the second zeroing extension instruction.

In one example, FIG. 13 is a schematic diagram of the first multiplication instruction for n-word data provided in an embodiment of the present application. As shown in FIG. 13, the first and second multiplication extension instructions are executed on the first multiplication data groups under each first data combination pattern (for example, i+j = 2n, i+j = 2n-1, i+j = 2n-2, i+j = 2n-3, ..., i+j = n+1, i+j = n, ..., i+j = 4, i+j = 3, i+j = 2 in FIG. 13) as follows. For the very first operation, the first multiplication extension instruction is executed on the first sub-data and second sub-data of the first multiplication data group. After that, for every first data combination pattern except the last one, the second multiplication extension instruction is executed first on the first sub-data and second sub-data of the first multiplication data groups under that pattern, until the last first multiplication data group under that pattern is reached, on which the first multiplication extension instruction is executed. Next, the second multiplication extension instruction is executed on the first sub-data and second sub-data of the first multiplication data group under the last first data combination pattern; once it completes, the data retention extension instruction and the second zeroing extension instruction are executed.

At this point, the high-order multiplication result can be obtained by outputting the value of the (n+1)-th word, ..., the 4th word, the 3rd word, the 2nd word, and the 1st word (that is, the highest 32 bits).

In one example, from the schematic diagram of the first multiplication instruction in FIG. 13 it can be determined that, when n is 2, the first data combination patterns are i+j = 4, i+j = 3, and i+j = 2, so there are 3 first data combination patterns. For the pattern i+j = 4 there is 1 first multiplication data group; for i+j = 3 there are 2; for i+j = 2 there is 1. It can therefore be determined that the first multiplication instruction includes a first number of 2 first multiplication extension instructions and a second number of 2 second multiplication extension instructions.
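Processing the patterns from i+j = 2n down to i+j = 2 resembles column-wise (product-scanning) multiplication, in which all word products of equal weight are accumulated before one result word is emitted. The sketch below illustrates that general technique and is not a cycle-accurate model of FIG. 13. It assumes 32-bit words and, as an interpretation, that index 1 denotes the most significant word, so that i+j = 2n is the lowest-weight column; the helper names are hypothetical.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


def to_words(value, n):
    """Words indexed 1..n, with index 1 the most significant (assumed convention)."""
    return {i: (value >> (WORD_BITS * (n - i))) & WORD_MASK for i in range(1, n + 1)}


def scan_columns(x, y, n):
    """Return the 2n words of x * y, least significant first, column by column."""
    X, Y = to_words(x, n), to_words(y, n)
    acc, words = 0, []
    for s in range(2 * n, 1, -1):                      # patterns i + j = 2n, ..., 2
        for i in range(max(1, s - n), min(n, s - 1) + 1):
            acc += X[i] * Y[s - i]                     # accumulate one column
        words.append(acc & WORD_MASK)                  # emit the low word
        acc >>= WORD_BITS                              # carry the rest forward
    words.append(acc)                                  # final carry is the top word
    return words


def high_half(x, y, n):
    """High n words of the 2n-word product, as an integer."""
    words = scan_columns(x, y, n)[n:]
    return sum(w << (WORD_BITS * k) for k, w in enumerate(words))
```

Every column has to be visited even though only the upper words are wanted, because carries from the low columns propagate into the high-order result.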

In one example, determining the second multiplication instruction based on the target word length, the first multiplication extension instruction, the second multiplication extension instruction, the first zeroing extension instruction, the second zeroing extension instruction, and the data retention extension instruction specifically includes the following steps:

First, based on the data length of a single word and the target word length, split the multiplicand and the multiplier to obtain a plurality of third sub-data corresponding to the multiplicand and a plurality of fourth sub-data corresponding to the multiplier.

In one example, the multiplicand and the multiplier can each be split starting from the least significant bit, using the length of a single word as the step, to obtain the plurality of third sub-data corresponding to the multiplicand and the plurality of fourth sub-data corresponding to the multiplier.

Then, determine the second data combination patterns that match the second multiplication instruction, and determine a plurality of second multiplication data groups based on the matching second data combination patterns.

A second multiplication data group indicates a third sub-data and a fourth sub-data that need to be multiplied together.

In one example, a second data combination pattern indicates how third sub-data and fourth sub-data are paired for processing. For example, let the third sub-data be denoted x_i, where i identifies the i-th third sub-data of the multiplicand, and let the fourth sub-data be denoted y_j, where j identifies the j-th fourth sub-data of the multiplier. A second data combination pattern may then be i+j = 2n: the i-th third sub-data and the j-th fourth sub-data whose indices sum to 2n are paired for processing, and a second multiplication data group is determined from each such pair.

In one example, there may be multiple second data combination patterns matching the second multiplication instruction. For example, the value of i+j given by the second data combination patterns may be 2n, 2n-1, ..., n+1.

The number of second multiplication data groups determined from a second data combination pattern may be one or more. For example, when i+j is 2n, both i and j equal n, so there is 1 second multiplication data group. When i+j is 3 (which lies in the range above when n is 2), either i is 1 and j is 2, or i is 2 and j is 1, so there are 2 second multiplication data groups.

Next, based on the number of second data combination patterns and the number of second multiplication data groups determined by each second data combination pattern, a third number of first multiplication extension instructions and a fourth number of second multiplication extension instructions included in the second multiplication instruction are determined.

Finally, the second multiplication instruction is determined based on the third number of first multiplication extension instructions, the fourth number of second multiplication extension instructions, and the first zeroing extension instruction.

In one example, FIG. 14 is a schematic diagram of a second multiplication instruction corresponding to n-word-length data provided by an embodiment of the present application. As shown in FIG. 14, the first multiplication extension instruction and the second multiplication extension instruction are executed on the second multiplication data groups under each second data combination pattern (for example, i+j = 2n, i+j = 2n-1, i+j = 2n-2, i+j = 2n-3, ..., i+j = n+1 as shown in FIG. 14). When it is determined that the operation is being performed for the first time, the first multiplication extension instruction is executed on the third sub-data and the fourth sub-data in the second multiplication data group. After that, for the third sub-data and the fourth sub-data in each second multiplication data group under each second data combination pattern, the second multiplication extension instruction is executed first and then the first multiplication extension instruction. After the first multiplication extension instruction and the second multiplication extension instruction have been executed for the third sub-data and the fourth sub-data in every second multiplication data group, the first zeroing extension instruction is executed.

At this point, the low-order multiplication result can be obtained by outputting the value of the lowest 32 bits, the value of the (2n-1)-th word, the value of the (2n-2)-th word, ..., the value of the (n+1)-th word, and the value of the 1st word.

In one example, from the schematic diagram of the second multiplication instruction shown in FIG. 14 it can be determined that, when n = 2, the second data combination patterns are i+j = 4 and i+j = 3, so the number of second data combination patterns is 2. When the second data combination pattern is i+j = 4, the number of second multiplication data groups determined is 1; when the second data combination pattern is i+j = 3, the number of second multiplication data groups determined is 2. It can then be determined that the third number of first multiplication extension instructions included in the second multiplication instruction is 2, and the fourth number of second multiplication extension instructions included is 1.

In one example, an embodiment of the present application can implement data multiplication based on the first multiplication instruction and the second multiplication instruction. Referring to FIG. 15, FIG. 15 is a schematic flowchart of a multiplication operation on n-word-length data provided by an embodiment of the present application. As shown in FIG. 15, the final multiplication result is obtained by simultaneously outputting the output result of the first multiplication instruction and the output result of the second multiplication instruction, thereby implementing the multiplication process.

In one example, when the data word length of the data to be operated on is a single word, the preset data operation instructions include a third multiplication extension instruction and a fourth multiplication extension instruction. In this case, when the first multiplication instruction and the second multiplication instruction are determined based on the data word length of the data to be operated on and the preset data operation instructions, the third multiplication extension instruction can be used directly as the second multiplication instruction, and the fourth multiplication extension instruction can be used directly as the first multiplication instruction.

When the multiplication of the data to be operated on is implemented based on the first multiplication instruction and the second multiplication instruction, reference may be made to FIG. 16, which is a schematic flowchart of a multiplication operation on single-word-length data provided by an embodiment of the present application. As shown in FIG. 16, the multiplication process is implemented by first executing the second multiplication instruction (that is, the third multiplication extension instruction) and then executing the first multiplication instruction (that is, the fourth multiplication extension instruction).
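The single-word case can be illustrated with plain integers. The sketch below only models the arithmetic effect of the two instructions (low-order product first, high-order product second) and does not model the registers or multiplexers of the device. The 32-bit word size and the names `mul_low` and `mul_high` are assumptions made for illustration.

```python
WORD_BITS = 32                    # assumed word size
WORD_MASK = (1 << WORD_BITS) - 1


def mul_low(x, y):
    """Low-order word of x * y (the effect attributed to the third multiplication extension instruction)."""
    return (x * y) & WORD_MASK


def mul_high(x, y):
    """High-order word of x * y (the effect attributed to the fourth multiplication extension instruction)."""
    return (x * y) >> WORD_BITS


x = y = 0xFFFFFFFF
print(hex(mul_high(x, y)), hex(mul_low(x, y)))
# 0xfffffffe 0x1
```

Concatenating the two outputs, high word first, reproduces the full 64-bit product.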

In one example, the first multiplication instruction is used to perform the first multiplication operation, and the second multiplication instruction is used to perform the second multiplication operation. After the first multiplication instruction and the second multiplication instruction for the modular multiplication of the data to be operated on have been determined, the modular multiplication operation can be performed on the data to be operated on to obtain the corresponding modular multiplication result. The modular multiplication operation here can be understood as a Montgomery modular multiplication operation, which can be performed according to the steps described below.

S1203. Based on the preset data operation instructions, perform the first multiplication operation on the multiplicand and the multiplier to obtain a first result representing the high-order bits, and perform the second multiplication operation on the multiplicand and the multiplier to obtain a first result representing the low-order bits.

S1204. Perform the second multiplication operation on the first result representing the low-order bits and the first intermediate data to obtain a second result representing the low-order bits.

S1205. Perform the first multiplication operation on the second result representing the low-order bits and the modulus data to obtain a third result representing the high-order bits, and perform the second multiplication operation on the second result representing the low-order bits and the modulus data to obtain a third result representing the low-order bits.

S1206. Add the first result representing the low-order bits and the third result representing the low-order bits to obtain a first summation result.

The first summation result includes first carry data and first summation data.

S1207. Add the first result representing the high-order bits, the third result representing the high-order bits, and the first carry data to obtain a second summation result.

The second summation result includes second carry data and second summation data.

S1208. If it is determined from the second carry data that a carry has occurred, or it is determined that the second summation data is greater than or equal to the modulus data, add the second summation data, the second intermediate data and the second target value to obtain a third summation result, and determine the third summation data included in the third summation result as the modular multiplication result corresponding to the data to be operated on.

In one example, the second target value indicates the carry data that needs to be added when the second summation data and the second intermediate data are added. In this case, the second target value may be 1.

S1209. Otherwise, determine the second summation data as the modular multiplication result corresponding to the data to be operated on.
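One way to read S1208: if the second intermediate data is the bitwise complement of the modulus (as the FIG. 17 description states with nm = ~m mod R), then adding it together with a carry-in of 1 and discarding the carry-out is ordinary two's complement subtraction of the modulus. The following minimal sketch checks this reading on small numbers. The function name `subtract_modulus` and the parameter `n_bits` are hypothetical.

```python
def subtract_modulus(res, m, n_bits):
    """Compute (res - m) mod 2**n_bits as res + ~m + 1, using additions only."""
    mask = (1 << n_bits) - 1
    nm = ~m & mask           # bitwise complement of m within n_bits
    total = res + nm + 1     # the carry-in of 1 completes the two's complement
    return total & mask      # the carry-out is discarded


print(subtract_modulus(20, 13, 5))
# 7
```

The same addition also covers the case where a carry occurred in S1207: with a 5-bit width, a true value of 35 is held as res = 3 plus a carry, and `subtract_modulus(3, 29, 5)` returns 6, which equals 35 - 29.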

In one example, FIG. 17 is a schematic flowchart of a Montgomery modular multiplication operation provided by an embodiment of the present application. As shown in FIG. 17, in response to a Montgomery modular multiplication instruction, the data to be calculated is obtained, including the multiplicand x, the multiplier y and the modulus m, that is, the inputs x, y, m shown in FIG. 17. Then, based on the data to be calculated, the pre-calculated data is determined, including the modular exponentiation data R, the first intermediate data w and the second intermediate data nm, that is, the step of determining the pre-calculated data R, w, nm shown in FIG. 17. Here, R = 2^N, where N satisfies 2^(N-1) < m ≤ 2^N; w = -m^(-1) mod R; and nm = ~m mod R, where ~ denotes the bitwise complement.
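The pre-calculated data can be sketched with Python integers as follows. This is an illustration under stated assumptions, not the device's procedure: the function name `precompute` is hypothetical, the modulus is assumed to be odd and at least 3 (so that the inverse modulo R exists), and N is taken as the bit length of m, which for such m agrees with the condition on N given above. The three-argument `pow` with exponent -1 requires Python 3.8 or later.

```python
def precompute(m):
    """Return (R, w, nm) for an odd modulus m >= 3."""
    if m < 3 or m % 2 == 0:
        raise ValueError("this sketch assumes an odd modulus m >= 3")
    n_bits = m.bit_length()     # N such that 2**(N-1) < m < 2**N for odd m >= 3
    R = 1 << n_bits             # R = 2**N
    w = (-pow(m, -1, R)) % R    # w = -m^(-1) mod R
    nm = ~m % R                 # bitwise complement of m, reduced mod R
    return R, w, nm


print(precompute(13))
# (16, 11, 2)
```

As a check, m * w mod R equals R - 1 (here 13 * 11 = 143 = 8 * 16 + 15), which is the defining property of w.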

After that, the product of the multiplicand x and the multiplier y is calculated; the high-order first result is stored in mult_div_xy and the low-order first result is stored in mult_mod_xy. This process can be expressed as: calculate mult_xy = Mult(x, y) = mult_div_xy || mult_mod_xy, where "||" denotes concatenation, for example, 01 || 10 = 0110. Here, mult_div_xy is the result of the first multiplication operation, and mult_mod_xy is the result of the second multiplication operation.
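The split of the product into a high-order part and a low-order part can be written as a shift and a mask. In the sketch below, the function names `mult` and `concat` and the explicit width parameter `n_bits` (the N with R = 2^N) are assumptions made for illustration.

```python
def mult(x, y, n_bits):
    """Return (high, low) with x * y == high * 2**n_bits + low."""
    product = x * y
    return product >> n_bits, product & ((1 << n_bits) - 1)


def concat(high, low, n_bits):
    """The "||" operator: place high in front of an n_bits-wide low part."""
    return (high << n_bits) | low


mult_div_xy, mult_mod_xy = mult(11, 7, 4)
print(mult_div_xy, mult_mod_xy, concat(mult_div_xy, mult_mod_xy, 4))
# 4 13 77
```

The concatenation example from the text, 01 || 10 = 0110, corresponds to `concat(0b01, 0b10, 2) == 0b0110`.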

Next, the second multiplication operation is performed on mult_mod_xy and the first intermediate data w to obtain the second result, which is stored in t. This process can be expressed as: calculate t = MultMod(mult_mod_xy, w).

Then, the product of t and the modulus m is calculated; the high-order third result is stored in mult_div_tm and the low-order third result is stored in mult_mod_tm. This process can be expressed as: calculate mult_tm = Mult(t, m) = mult_div_tm || mult_mod_tm.

Then, a modular addition operation is performed on the low-order first result mult_mod_xy and the low-order third result mult_mod_tm, with the carry input of the modular addition set to 0. After the modular addition, the most significant bit obtained (that is, the first carry data) is stored in cout1, and the remaining data (that is, the first summation data) is stored in drop1. This process can be expressed as: calculate (drop1, cout1) = Add(mult_mod_xy, mult_mod_tm, 0).

Next, a modular addition operation is performed on the high-order first result mult_div_xy and the high-order third result mult_div_tm, with the carry input of the modular addition set to cout1. After the modular addition, the most significant bit obtained (that is, the second carry data) is stored in cout2, and the remaining data (that is, the second summation data) is stored in res. This process can be expressed as: calculate (res, cout2) = Add(mult_div_xy, mult_div_tm, cout1).
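The two chained additions can be traced on small numbers. The sketch below assumes toy values that are not taken from the application (m = 13, N = 4, R = 16, x = 11, y = 7, hence w = 11 and t = 15), and the helper name `add` is hypothetical. It also illustrates why the first summation data is named drop1: because t is chosen so that t * m cancels the low-order part of x * y modulo R, the low-order sum is always zero and only its carry matters.

```python
def add(a, b, carry_in, n_bits):
    """Return (sum, carry_out) of an n_bits-wide addition with carry-in."""
    total = a + b + carry_in
    return total & ((1 << n_bits) - 1), total >> n_bits


n_bits = 4                          # assumed toy width, R = 16
mult_div_xy, mult_mod_xy = 4, 13    # x * y = 77  = 4 * 16 + 13
mult_div_tm, mult_mod_tm = 12, 3    # t * m = 195 = 12 * 16 + 3

drop1, cout1 = add(mult_mod_xy, mult_mod_tm, 0, n_bits)
res, cout2 = add(mult_div_xy, mult_div_tm, cout1, n_bits)
print(drop1, cout1, res, cout2)
# 0 1 1 1
```

Here cout2 = 1, so the true value of the high-order sum is 16 + 1 = 17, which is (77 + 195) / 16.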

After that, it is determined whether cout2 = 1 or res ≥ m is satisfied.

If so, a modular addition operation is performed on the second summation data res and the second intermediate data nm, with the carry input of the modular addition set to 1. After the modular addition, the most significant bit is stored in drop2, and the remaining data (that is, the third summation data) is stored in z. This process can be expressed as: calculate (z, drop2) = Add(res, nm, 1). The data held in z (that is, the third summation data) is then determined as the result of the Montgomery modular multiplication operation.

Otherwise, the data held in res (that is, the second summation data) is determined as the result of the Montgomery modular multiplication operation.
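The flow described for FIG. 17 can be put together as a software reference model. This is a minimal sketch using Python's arbitrary-precision integers, not the device's instruction sequence. It assumes an odd modulus m of at least 3, inputs x and y smaller than m, and N equal to the bit length of m. The function name `montgomery_multiply` is hypothetical, while the local variable names follow the text. Python 3.8 or later is assumed for the modular inverse.

```python
def montgomery_multiply(x, y, m):
    """Return x * y * R^(-1) mod m with R = 2**N, following the steps described for FIG. 17."""
    n_bits = m.bit_length()
    R = 1 << n_bits
    mask = R - 1
    w = (-pow(m, -1, R)) % R     # first intermediate data
    nm = ~m & mask               # second intermediate data

    xy = x * y
    mult_div_xy, mult_mod_xy = xy >> n_bits, xy & mask
    t = (mult_mod_xy * w) & mask                 # MultMod: low-order part only
    tm = t * m
    mult_div_tm, mult_mod_tm = tm >> n_bits, tm & mask

    s1 = mult_mod_xy + mult_mod_tm               # Add(..., carry-in 0)
    drop1, cout1 = s1 & mask, s1 >> n_bits       # drop1 is always 0
    s2 = mult_div_xy + mult_div_tm + cout1       # Add(..., carry-in cout1)
    res, cout2 = s2 & mask, s2 >> n_bits

    if cout2 == 1 or res >= m:
        s3 = res + nm + 1                        # subtract m via its complement
        z, drop2 = s3 & mask, s3 >> n_bits
        return z
    return res


print(montgomery_multiply(11, 7, 13))
# 4
```

With m = 13 and R = 16, the expected value is 11 * 7 * 16^(-1) mod 13 = 4, which the sketch reproduces. A single conditional subtraction is sufficient because, for x, y < m < R, the value (x * y + t * m) / R is smaller than 2m.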

In the above implementation, a modular multiplication operation device and the data operation instructions that the device supports are designed to implement the modular multiplication operation. Through the design of a low-cost hardware operation unit, the computing speed and computing performance of the modular multiplication operation are significantly improved, and the robustness of modular multiplication applications is enhanced.

一个示例中，在基于上述模乘运算方法和模乘运算装置进行蒙哥马利模乘运算之后，与传统的基于软件算法实现蒙哥马利模乘运算的方法相比In one example, after performing Montgomery modular multiplication based on the modular multiplication method and modular multiplication device, compared with the conventional method of implementing Montgomery modular multiplication based on software algorithm,

In one example, in a RISC-V simulation environment, an embodiment of the present application evaluated the performance of the modular multiplication operation before optimization and of the modular multiplication method provided by the present application, which is implemented with the data operation instructions and the modular multiplication operation device. The evaluation results are shown in FIG. 18.

FIG. 18 is a schematic diagram of a modular multiplication performance evaluation result provided by an embodiment of the present application. As shown in FIG. 18, before optimization, one modular multiplication operation takes 3543 cycles. After optimization with the modular multiplication method provided by the embodiments of the present application, one modular multiplication operation takes 1272 cycles. The cycle count is thus reduced by about 64%, which is a significant improvement in modular multiplication performance.
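The improvement figure can be checked from the two cycle counts reported for FIG. 18. The short calculation below reads the improvement as the relative reduction in cycles, which is an interpretation of the reported 64%. The variable names are illustrative.

```python
cycles_before = 3543    # cycles per modular multiplication before optimization
cycles_after = 1272     # cycles per modular multiplication after optimization

reduction = (cycles_before - cycles_after) / cycles_before
speedup = cycles_before / cycles_after

print(f"{reduction:.1%}")   # 64.1%
print(f"{speedup:.2f}x")    # 2.79x
```

Expressed as a ratio, the same measurement corresponds to roughly a 2.8-fold reduction in execution cycles.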

The inventors also found that, when an elliptic curve cryptographic algorithm is computed using the modular multiplication method provided by the embodiments of the present application, the performance of the elliptic curve cryptographic algorithm improves by about 24%.

FIG. 19 is a schematic diagram of a chip provided by an embodiment of the present application. As shown in FIG. 19, the chip includes the modular multiplication operation device described above.

FIG. 20 is a schematic diagram of a board provided by an embodiment of the present application. As shown in FIG. 20, the board includes the chip shown in FIG. 19.

FIG. 21 is a schematic diagram of a vehicle-mounted system provided by an embodiment of the present application. As shown in FIG. 21, the vehicle-mounted system includes the chip shown in FIG. 19 or the board shown in FIG. 20.

In the several embodiments provided in the present application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into modules is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple modules may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, devices or modules, and may be electrical, mechanical or of another form.

The modules may be physically separate, for example, installed at different locations of one device, installed on different devices, distributed over multiple network units, or distributed over multiple processors. The modules may also be integrated together, for example, installed in the same device or integrated in one set of code. Each module may exist in the form of hardware, in the form of software, or as a combination of software and hardware. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

When the modules are implemented as an integrated module in the form of software function modules, the integrated module may be stored in a computer-readable storage medium. The software function modules are stored in a storage medium and include a number of instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) or a processor to execute some of the steps of the methods of the embodiments of the present application.

It should be understood that, although the steps in the flowcharts of the above embodiments are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict restriction on the order in which these steps are executed, and they may be executed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different moments, and they are not necessarily executed sequentially but may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

Other embodiments of the present application will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. The present application is intended to cover any variations, uses or adaptations of the present application that follow its general principles and include common general knowledge or customary technical means in the art that are not disclosed in the present application. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of the present application are indicated by the following claims.

It should be understood that the present application is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.

Claims

1. A modular multiplication operation device, characterized in that the operation device comprises: a processor and a data operator; wherein the data operator comprises a multiplier, an accumulator, a first register, a second register, a third register, a first multiplexer, a second multiplexer, a third multiplexer and a fourth multiplexer; the processor is respectively connected to the multiplier, the first multiplexer, the second multiplexer and the third multiplexer; the first multiplexer is connected to the first register; the first register is respectively connected to the first multiplexer, the second multiplexer and the fourth multiplexer; the second multiplexer is respectively connected to the first register, the second register and the processor; the second register is respectively connected to the accumulator and the second multiplexer; the accumulator is respectively connected to the third multiplexer and the fourth multiplexer; the third multiplexer is connected to the third register; the third register is respectively connected to the accumulator and the fourth multiplexer;

The processor is used to obtain target data matching the data operation instruction after receiving the data operation instruction; and transmit the target data to the data operator;

The data operator is used to receive the target data and perform operation processing on the target data to obtain a data operation result; wherein the data operation result is used to indicate a modular multiplication operation result.

2. The computing device according to claim 1, characterized in that the data operation instruction represents a first multiplication extension instruction; wherein the first multiplication extension instruction represents an instruction for performing a multiplication operation and summing; and the target data includes a first multiplier and a first multiplicand;

The processor is configured to, after receiving the first multiplication extension instruction, transmit the first multiplier and the first multiplicand to the multiplier, so as to perform a multiplication operation on the first multiplier and the first multiplicand based on the multiplier to obtain a first multiplication operation result; transmit and save the first multiplication operation result to the first register through the first multiplexer;

The first register is used to receive the first multiplication result, and transmit and save the first multiplication result to the second register through the second multiplexer;

The second register is used to obtain the data stored in the third register, and transmit the data stored in the third register and the received first multiplication result to the accumulator;

The accumulator is used to perform an addition operation on the data stored in the third register and the first multiplication result to obtain a first addition result; and save the first addition result to the third register through the third multiplexer.

3. The computing device according to claim 1, characterized in that the data operation instruction represents a second multiplication extension instruction; wherein the second multiplication extension instruction represents an instruction for performing multiplication and summing, retaining the high-order operation result, and outputting the low-order operation result; the target data includes a second multiplier and a second multiplicand;

The processor is configured to, after receiving the second multiplication extension instruction, transmit the second multiplier and the second multiplicand to the multiplier, so as to perform a multiplication operation on the second multiplier and the second multiplicand based on the multiplier to obtain a second multiplication operation result; transmit the second multiplication operation result to the first register through the first multiplexer and save it;

The first register is used to receive the second multiplication result, and transmit and save the second multiplication result to the second register through the second multiplexer;

The second register is used to transfer the second multiplication result to the accumulator;

The accumulator is used to obtain the data stored in the third register, and perform an addition operation on the data stored in the third register and the received second multiplication operation result to obtain a second addition operation result;

The processor is further configured to obtain and output the low-order data in the second addition operation result through the fourth multiplexer; and to save the high-order data in the second addition operation result into the third register through the third multiplexer.

4. The computing device according to claim 1, characterized in that the data operation instruction represents a third multiplication extension instruction; wherein the third multiplication extension instruction represents an instruction for outputting a low-order multiplication result after performing a multiplication operation; and the target data includes a third multiplier and a third multiplicand;

The processor is configured to transmit the third multiplier and the third multiplicand to the multiplier after receiving the third multiplication extension instruction;

The multiplier is used to receive the third multiplier and the third multiplicand; perform a multiplication operation on the third multiplier and the third multiplicand to obtain a third multiplication operation result; transmit the third multiplication operation result through the first multiplexer and save it in the first register;

The processor is further configured to select and output low-order data in the third multiplication result based on the fourth multiplexer.

5. The computing device according to claim 1, characterized in that the data operation instruction represents a fourth multiplication extension instruction; wherein the fourth multiplication extension instruction represents an instruction for outputting a high-order multiplication result after performing a multiplication operation; and the target data includes a fourth multiplier, a fourth multiplicand and a first target value;

The processor is configured to transmit the fourth multiplier and the fourth multiplicand to the multiplier after receiving the fourth multiplication extension instruction;

The multiplier is configured to receive the fourth multiplier and the fourth multiplicand; perform a multiplication operation on the fourth multiplier and the fourth multiplicand to obtain a fourth multiplication operation result; transmit the fourth multiplication operation result through the first multiplexer and save it in the first register;

The first register is used to transmit the fourth multiplication result to the second register through the second multiplexer;

The processor is also used to output the high-order data in the fourth multiplication result stored in the second register through the fourth multiplexer; transmit the first target value to the second register through the second multiplexer to set the second register to zero; transmit the first target value to the third register through the third multiplexer to set the third register to zero.

6. The computing device according to claim 1, characterized in that the data operation instruction represents a first zeroing extension instruction; wherein the first zeroing extension instruction represents an instruction to output low-order data and then set it to zero; the target data includes a summation result after the accumulator operation and a first target value;

The processor is used to select and output the low-order data in the summation result after the accumulator operation based on the fourth multiplexer; transmit the first target value to the second register through the second multiplexer to set the second register to zero; transmit the first target value to the third register through the third multiplexer to set the third register to zero.

7. The computing device according to claim 1, characterized in that the data computing instruction represents a second zeroing extension instruction; wherein the second zeroing extension instruction represents an instruction to output high-order data and then set it to zero; the target data includes the data stored in the third register and the first target value;

The processor is used to output the high-order data in the data stored in the third register based on the fourth multiplexer; transmit the first target value to the second register through the second multiplexer to set the second register to zero; transmit the first target value to the third register through the third multiplexer to set the third register to zero.

8. The computing device according to claim 1, characterized in that the data computing instruction represents a data retention extension instruction; wherein the data retention extension instruction represents an instruction for retaining the remaining data after outputting the low-order data; the target data includes the data stored in the second register and the data stored in the third register;

The accumulator is used to receive the data stored in the second register and the third register; perform an addition operation on the data stored in the second register and the data stored in the third register to obtain a third addition operation result;

The processor is used to output the low-order data in the third addition result through the fourth multiplexer after the accumulator obtains the third addition result; and transmit and save the high-order data in the third addition result to the third register through the third multiplexer.

9. A modular multiplication method, characterized in that it is applied to a processor; the method comprises:

Obtaining, in response to a modular multiplication operation instruction, data to be operated; wherein the data to be operated includes data to be calculated and pre-calculated data; the data to be calculated represents the data on which the modular multiplication operation is to be performed; the pre-calculated data represents intermediate data relied on during the modular multiplication operation; and the pre-calculated data is determined based on the data to be calculated;

Performing, based on a preset data operation instruction, a modular multiplication operation on the data to be operated to obtain a modular multiplication operation result corresponding to the data to be operated; wherein the preset data operation instruction is implemented based on the modular multiplication operation device according to any one of claims 1 to 6.

10. The method according to claim 9, characterized in that the data to be calculated includes a multiplicand, a multiplier and modulus data; the pre-calculated data includes modular exponentiation data determined based on the modulus data, and first intermediate data and second intermediate data determined based on the modular exponentiation data; and the performing of the modular multiplication operation on the data to be operated based on the preset data operation instruction to obtain the modular multiplication operation result corresponding to the data to be operated comprises:

Performing, based on the preset data operation instruction, a first multiplication operation on the multiplicand and the multiplier to obtain a first result representing the high bits, and performing a second multiplication operation on the multiplicand and the multiplier to obtain a first result representing the low bits;

Performing the second multiplication operation on the first result representing the low bits and the first intermediate data to obtain a second result representing the low bits;

Performing the first multiplication operation on the second result representing the low bits and the modulus data to obtain a third result representing the high bits, and performing the second multiplication operation on the second result representing the low bits and the modulus data to obtain a third result representing the low bits;

Adding the first result representing the low bits and the third result representing the low bits to obtain a first sum result; wherein the first sum result includes first carry data and first sum data;

Adding the first result representing the high bits, the third result representing the high bits, and the first carry data to obtain a second sum result; wherein the second sum result includes second carry data and second sum data;

Determining, based on the second carry data and the second sum data, the modular multiplication operation result corresponding to the data to be operated.
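The sequence of high and low multiplications in claim 10 reads like a word-level Montgomery reduction. The following is a minimal sketch under that interpretation, not the claimed implementation. The 16-bit word width, the function names, the example modulus 65521, and in particular the assumption that the first intermediate data equals the negated inverse of the modulus modulo 2^W are all illustrative choices that the claim itself does not state.

```python
W = 16                    # assumed word width, kept small for illustration
R = 1 << W
MASK = R - 1


def mul_high(x, y):
    """Stand-in for the 'first multiplication operation' (high bits of x*y)."""
    return (x * y) >> W


def mul_low(x, y):
    """Stand-in for the 'second multiplication operation' (low bits of x*y)."""
    return (x * y) & MASK


def montgomery_steps(a, b, n, n_prime):
    """Follow the steps of claim 10 and return (second carry, second sum)."""
    hi1 = mul_high(a, b)              # first result, high bits
    lo1 = mul_low(a, b)               # first result, low bits
    lo2 = mul_low(lo1, n_prime)       # second result, low bits
    hi3 = mul_high(lo2, n)            # third result, high bits
    lo3 = mul_low(lo2, n)             # third result, low bits
    s1 = lo1 + lo3                    # first sum result
    carry1 = s1 >> W                  # first carry data (first sum data is s1 & MASK)
    s2 = hi1 + hi3 + carry1           # second sum result
    return s2 >> W, s2 & MASK         # second carry data, second sum data


n = 65521                             # example odd modulus below 2**16 (hypothetical)
n_prime = (-pow(n, -1, R)) % R        # assumed form of the first intermediate data
carry2, sum2 = montgomery_steps(12345, 54321, n, n_prime)
```

Under these assumptions the value `carry2 * R + sum2` is congruent to `a * b * R^-1` modulo `n` and lies below `2 * n`, which is why a single conditional correction is enough to finish the computation.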

11. The method according to claim 10, characterized in that the step of determining the modular multiplication result corresponding to the data to be operated based on the second carry data and the second sum data comprises:

If it is determined, based on the second carry data, that a carry has occurred, or if the second sum data is determined to be greater than or equal to the modulus data, adding the second sum data, the second intermediate data and the second target value to obtain a third sum result, and determining the third sum data included in the third sum result as the modular multiplication operation result corresponding to the data to be operated;

Otherwise, determining the second sum data as the modular multiplication operation result corresponding to the data to be operated.
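The correction in claim 11 adds two values instead of subtracting the modulus. One way this can work, offered only as a sketch, is two's complement subtraction: it assumes the second intermediate data is the bitwise complement of the modulus within one word and the second target value is 1, so that the addition is equivalent to subtracting the modulus modulo 2^W. Neither assumption is confirmed by the claims, and the name `final_correction` is hypothetical.

```python
W = 16                    # assumed word width, matching a small example
R = 1 << W
MASK = R - 1


def final_correction(carry2, sum2, n):
    """Apply the conditional correction of claim 11 to (carry2, sum2)."""
    second_intermediate = ~n & MASK   # assumption: bitwise complement of the modulus
    second_target = 1                 # assumption: the constant 1
    if carry2 or sum2 >= n:
        third_sum = sum2 + second_intermediate + second_target
        return third_sum & MASK       # third sum data (any carry out is discarded)
    return sum2


n = 65521                             # example modulus (hypothetical)
result = final_correction(1, 5, n)    # represents the value 65536 + 5
```

In the carry case the true value is `R + sum2`, which is below `2 * n`, so `sum2 + R - n` already fits in one word. In the no-carry case the addition overflows one word and the discarded carry leaves `sum2 - n`.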

12. The method according to claim 10, characterized in that before performing modular multiplication operation on the data to be operated based on a preset data operation instruction to obtain a modular multiplication operation result corresponding to the data to be operated, the method further comprises:

Based on the data word length corresponding to the data to be operated and the preset data operation instruction, a first multiplication instruction and a second multiplication instruction are determined; wherein the first multiplication instruction is used to implement the first multiplication operation; and the second multiplication instruction is used to implement the second multiplication operation.

13. The method according to claim 12, characterized in that the data word length corresponding to the data to be operated is a multi-word length; the preset data operation instruction includes a first multiplication extension instruction, a second multiplication extension instruction, a first zeroing extension instruction, a second zeroing extension instruction and a data retention extension instruction; the determining the first multiplication instruction and the second multiplication instruction based on the data word length corresponding to the data to be operated and the preset data operation instruction comprises:

Determine the target word length corresponding to the multiplicand and the multiplier in the data to be operated; wherein the target word length represents the data word length of the multiplicand or the multiplier;

The first multiplication instruction and the second multiplication instruction are determined based on the target word length, the first multiplication extension instruction, the second multiplication extension instruction, the first zeroing extension instruction, the second zeroing extension instruction and the data retention extension instruction.

14. The method according to claim 13, characterized in that determining the first multiplication instruction based on the target word length, the first multiplication extension instruction, the second multiplication extension instruction, the first zeroing extension instruction, the second zeroing extension instruction and the data retention extension instruction comprises:

Based on the data length corresponding to the single word length and the target word length, the multiplicand and the multiplier are segmented to obtain a plurality of first sub-data corresponding to the multiplicand and a plurality of second sub-data corresponding to the multiplier;

Determine a first data combination pattern that matches the first multiplication instruction; and determine a plurality of first multiplication data groups based on the matched first data combination pattern; wherein the first multiplication data groups are used to indicate first sub-data and second sub-data that need to be multiplied;

Determining a first number of first multiplication extension instructions and a second number of second multiplication extension instructions included in the first multiplication instruction based on the number of the first data combination patterns and the number of first multiplication data groups determined based on each of the first data combination patterns;

The first multiplication instruction is determined based on the first number of first multiplication extension instructions, the second number of second multiplication extension instructions, the data retention extension instruction and the second zeroing extension instruction.

15. The method according to claim 13, characterized in that determining the second multiplication instruction based on the target word length, the first multiplication extension instruction, the second multiplication extension instruction, the first zeroing extension instruction, the second zeroing extension instruction and the data retention extension instruction comprises:

Based on the data length corresponding to the single word length and the target word length, the multiplicand and the multiplier are segmented to obtain a plurality of third sub-data corresponding to the multiplicand and a plurality of fourth sub-data corresponding to the multiplier;

Determine a second data combination pattern that matches the second multiplication instruction; and determine a plurality of second multiplication data groups based on the matched second data combination pattern; wherein the second multiplication data groups are used to indicate third sub-data and fourth sub-data that need to be multiplied;

Determining a third number of first multiplication extension instructions and a fourth number of second multiplication extension instructions included in the second multiplication instruction based on the number of the second data combination patterns and the number of second multiplication data groups determined based on each of the second data combination patterns;

The second multiplication instruction is determined based on the third number of first multiplication extension instructions, the fourth number of second multiplication extension instructions, and the first zeroing extension instruction.
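The segmentation and pairing described in claims 14 and 15 can be pictured as a column-wise (product-scanning) multi-word multiplication. The sketch below is an interpretation, not the claimed instruction sequence: it assumes a 32-bit single word, treats each "data combination pattern" as the set of sub-data index pairs `(i, j)` with `i + j = k` for one output column `k`, and computes the low and high halves in a single pass. The function names are hypothetical.

```python
W = 32                    # assumed single word length in bits
MASK = (1 << W) - 1


def split_words(x, n):
    """Segment x into n sub-data of W bits each, least significant first."""
    return [(x >> (W * i)) & MASK for i in range(n)]


def column_pairs(k, n):
    """Index pairs (i, j) with i + j == k: one assumed 'data combination pattern'."""
    return [(i, k - i) for i in range(n) if 0 <= k - i < n]


def mul_low_high(a, b, n):
    """Return (low half, high half) of a*b for n-word operands."""
    a_w, b_w = split_words(a, n), split_words(b, n)
    acc = 0
    out = []
    for k in range(2 * n - 1):
        for i, j in column_pairs(k, n):
            acc += a_w[i] * b_w[j]    # multiply-accumulate one data group
        out.append(acc & MASK)        # emit the low word of the column sum
        acc >>= W                     # retain the remainder for the next column
    out.append(acc & MASK)            # top word left in the accumulator
    low = sum(w << (W * i) for i, w in enumerate(out[:n]))
    high = sum(w << (W * i) for i, w in enumerate(out[n:]))
    return low, high


a = 0xFFFFFFFF_FFFFFFFF_00000001_12345678
b = 0x0FEDCBA9_87654321_FFFFFFFF_FFFFFFFE
low, high = mul_low_high(a, b, 4)
```

Emitting the low word of each column while keeping the rest loosely mirrors the output-then-retain behaviour attributed to the data retention extension instruction. The number of pairs per column, `len(column_pairs(k, n))`, corresponds to the count of multiply-accumulate steps that column needs.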

16. The method according to claim 12, characterized in that the data word length corresponding to the data to be operated is a single word length; the preset data operation instruction includes a third multiplication extension instruction and a fourth multiplication extension instruction; the determining the first multiplication instruction and the second multiplication instruction based on the data word length corresponding to the data to be operated and the preset data operation instruction comprises:

The third multiplication extension instruction is determined as the second multiplication instruction, and the fourth multiplication extension instruction is determined as the first multiplication instruction.

17. The method according to claim 9, characterized in that the method is used for computations in elliptic curve cryptography.

18. A chip, characterized in that the chip comprises the modular multiplication operation device according to any one of claims 1 to 8.

19. A board, characterized in that the board comprises the chip according to claim 18.

20. An in-vehicle system, characterized in that the in-vehicle system comprises the chip according to claim 18 or the board according to claim 19.