WO2024032027A1 - Method for reducing power consumption, and processor, electronic device and storage medium - Google Patents

Info

Publication number
WO2024032027A1
WO2024032027A1 PCT/CN2023/089977 CN2023089977W WO2024032027A1 WO 2024032027 A1 WO2024032027 A1 WO 2024032027A1 CN 2023089977 W CN2023089977 W CN 2023089977W WO 2024032027 A1 WO2024032027 A1 WO 2024032027A1
Authority
WO
WIPO (PCT)
Prior art keywords
calculated
target
data
processed
instruction
Prior art date
Application number
PCT/CN2023/089977
Other languages
French (fr)
Chinese (zh)
Inventor
鲍道川
尹磊祖
李高山
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2024032027A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/329 Power saving characterised by the action undertaken by task scheduling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present application relates to processor technology, and in particular to a method for reducing power consumption, a processor, an electronic device and a storage medium.
  • Embodiments of the present application provide a method, a processor, an electronic device, and a storage medium for reducing power consumption.
  • The method for reducing power consumption in the embodiment of the present application includes: determining, according to a target instruction, the part to be calculated of the data to be processed, wherein the data to be processed includes the part to be calculated and a non-calculated part; and performing an operation on the part to be calculated to obtain output data.
  • the processor in the embodiment of the present application includes a preprocessing module and an operation module.
  • The preprocessing module is used to obtain, according to a target instruction, the part to be calculated of the data to be processed, where the data to be processed is divided into the part to be calculated and a non-calculated part;
  • the operation module is used to perform operations on the part to be calculated to obtain output data.
  • the electronic device in the embodiment of the present application includes a housing and the processor described in the above embodiment.
  • The computer-readable storage medium in the embodiment of the present application has a computer program stored thereon.
  • When the program is executed by a processor, the steps of the method for reducing power consumption described in the above embodiment are implemented.
  • Figure 1 is a schematic flowchart of a method for reducing power consumption in some embodiments of the present application
  • Figure 2 is a schematic diagram of a processor in some embodiments of the present application.
  • Figure 3 is a schematic diagram of an electronic device according to certain embodiments of the present application.
  • Figure 4 is a schematic flowchart of a method for reducing power consumption in some embodiments of the present application.
  • Figure 5 is a schematic diagram comparing methods for reducing power consumption and related technologies in certain embodiments of the present application.
  • Figure 6 is a schematic diagram comparing methods for reducing power consumption and related technologies in certain embodiments of the present application.
  • Figure 7 is a schematic flowchart of a method for reducing power consumption in some embodiments of the present application.
  • Figure 8 is a schematic diagram of a processor in some embodiments of the present application.
  • Figure 9 is a schematic flowchart of a method for reducing power consumption in some embodiments of the present application.
  • The terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, features defined as "first" and "second" may explicitly or implicitly include one or more of those features. In the description of the embodiments of the present application, "plurality" means two or more, unless otherwise explicitly and specifically limited.
  • the method for reducing power consumption in the embodiment of this application includes:
  • 01: According to the target instruction, determine the part to be calculated of the data to be processed, where the data to be processed includes the part to be calculated and a non-calculated part;
  • 02: Perform operations on the part to be calculated to obtain output data.
  • the target instruction provides the basis for determining the part to be calculated of the data to be processed.
  • The target instruction may include a part that provides a basis for determining the part to be calculated of the data to be processed, and may also include a part that provides a basis for performing operations on the part to be calculated. It is sufficient that the part to be calculated can be determined according to the target instruction; this application imposes no specific restriction. For ease of understanding, an example is given below in which the data to be processed includes a real part and an imaginary part.
  • In one embodiment, the target instruction may include calculating the real part, so that the part to be calculated of the data to be processed is determined, according to the target instruction, to be the real part. In another embodiment, the target instruction may include adding the imaginary parts, so that the part to be calculated is determined, according to the target instruction, to be the imaginary part, and an addition operation is performed on the imaginary part according to the target instruction to obtain the output data.
  • the parts to be calculated and the non-calculated parts are different parts of the same data respectively, that is, the parts to be calculated and the non-calculated parts together constitute the same data to be processed.
  • the method of reducing power consumption in this application is to reduce power consumption by distinguishing parts of a certain data to be processed that do not require calculation.
  • the operations performed on the parts to be calculated may include but are not limited to addition, subtraction, multiplication, division and other operations, which are not specifically limited in this application.
  • the processor 30 in the embodiment of the present application includes a preprocessing module 31 and an operation module 32 .
  • The method of reducing power consumption of the present application can be implemented by the processor 30 of the embodiment of the present application, wherein step 01 can be implemented by the preprocessing module 31 and step 02 can be implemented by the operation module 32. That is to say, the preprocessing module 31 can be used to obtain, according to the target instruction, the part to be calculated of the data to be processed, wherein the data to be processed is divided into a part to be calculated and a non-calculated part.
  • the operation module 32 can be used to perform operations on the parts to be calculated to obtain output data.
  • The data to be processed can be divided into a part to be calculated and a non-calculated part. Only the part to be calculated needs to be calculated, and the non-calculated part is not, which avoids redundant calculation and achieves the effect of reducing power consumption.
  • the processor 30 may be applied to the electronic device 100 .
  • the electronic device 100 may include a smartphone, a tablet, a smart watch, a smart bracelet, and other devices, which are not specifically limited here.
  • the electronic device 100 in the embodiment of the present application is explained by taking a smartphone as an example, which should not be understood as a limitation of the present application.
  • Electronic device 100 also includes housing 50 .
  • power consumption can be reduced by reducing voltage, reducing frequency, power gating and other technical methods.
  • some processors are configured with corresponding registers to shield redundant calculations. However, configuring corresponding registers will bring many unnecessary operations, and additional registers will also bring unnecessary hardware overhead.
  • This application does not need to configure registers and avoids additional hardware overhead. This application can obtain the to-be-calculated part of the data to be processed according to the target instruction, thereby avoiding the calculation of the non-calculated part, shielding redundant calculations, and achieving the effect of reducing power consumption.
  • step 01 may be to obtain several parts to be calculated of the data to be processed according to the target instruction, and perform operations on each part to be calculated in step 02, which is not specifically limited in this application.
  • The target instruction includes a target calculation instruction, and step 02 includes:
  • 021: According to the target calculation instruction, perform operations on the part to be calculated to obtain output data.
  • The above step can be implemented by the operation module 32. That is to say, the operation module 32 can be used to perform operations on the part to be calculated according to the target calculation instruction to obtain output data.
  • the target instructions may include target calculation instructions, and the target instructions may also include other instructions.
  • The target calculation instruction can be understood as a calculation instruction that both provides a basis for obtaining the part to be calculated of the data to be processed and provides a basis for performing operations on the part to be calculated.
  • For example, the data to be processed includes a real part and an imaginary part, and the target instruction includes a target calculation instruction that adds the real part of the data to be processed. According to this target calculation instruction, the real part of the data to be processed is obtained, and an addition operation is performed on the real part to obtain the output data.
  • Step 01 includes:
  • According to the target calculation instruction, obtain the real part or the imaginary part as the part to be calculated;
  • According to the target calculation instruction, perform operations on the real part or the imaginary part to obtain the output data.
  • Obtaining the real part or the imaginary part as the part to be calculated can be implemented by the preprocessing module 31, and performing operations on the real part or the imaginary part to obtain the output data can be implemented by the operation module 32. That is to say, the preprocessing module 31 can be used to obtain the real part or the imaginary part as the part to be calculated according to the target calculation instruction.
  • the operation module 32 can be used to perform operations on the real number part or the imaginary number part according to the target calculation instruction to obtain output data.
  • The real part or the imaginary part that needs to be calculated can be obtained according to the target calculation instruction, and the redundant part that does not need to be calculated is masked, thereby reducing power consumption.
  • the target calculation instructions may include but are not limited to calculation instructions such as addition of the real part, multiplication of the real part, subtraction of the real part, addition of the imaginary part, etc., which are not specifically limited in this application.
  • the real number part can be obtained according to the target instruction and the real number part can be operated; or the imaginary number part can be obtained according to the target instruction and the imaginary number part can be operated. There is no specific limitation in this application.
  • VR1 represents calculation instructions that do not include real part parameters and imaginary part parameters in related technologies.
  • VR1, VR2 and VR3 respectively represent different vector registers.
  • the real part and the imaginary part of each element in the vector register VR1 and the vector register VR2 are subtracted correspondingly, and the result is placed in the vector register VR3. That is, in related technologies, the imaginary part needs to be calculated together, resulting in increased power consumption.
  • VSUB.SC16.IMAG VR3 VR2 VR1 represents the target calculation instruction of this application.
  • the target calculation instruction includes real part parameters or imaginary part parameters.
  • IMAG refers to the imaginary part parameters
  • SUB refers to subtraction.
  • the expression method of the target calculation instruction is not limited to VSUB.SC16.IMAG VR3 VR2 VR1. There are many other expression methods of the target calculation instruction, and the specific expression method is not specifically limited in this application.
  • step 01 includes:
  • According to the target calculation instruction, obtain at least one quadrant part among the several quadrant parts as the part to be calculated;
  • According to the target calculation instruction, perform operations on at least one quadrant part among the several quadrant parts to obtain output data.
  • Obtaining at least one quadrant part among the several quadrant parts as the part to be calculated can be implemented by the preprocessing module 31, and performing operations on at least one quadrant part among the several quadrant parts according to the target calculation instruction to obtain the output data can be implemented by the operation module 32. That is to say, the preprocessing module 31 can be used to obtain at least one quadrant part among the several quadrant parts as the part to be calculated according to the target calculation instruction.
  • the operation module 32 may be used to perform operation on at least one quadrant part among several quadrant parts according to the target calculation instruction to obtain output data.
  • The quadrants that need to be calculated can be obtained according to the target calculation instruction, and the redundant parts that do not need to be calculated are masked, thereby reducing power consumption.
  • the data to be processed may include two quadrants, three quadrants, four quadrants, etc., which are not specifically limited in this application.
  • the target calculation instructions may include but are not limited to addition in the first quadrant, subtraction in the second quadrant, etc., and are not specifically limited in this application. According to the target instruction, the quadrant that needs to be calculated can be obtained, and the operation can be performed on the quadrant that needs to be calculated.
  • VADD.S8 VR3 VR2 VR1 represents calculation instructions that do not include quadrant parameters in related technologies.
  • VR1, VR2 and VR3 respectively represent different vector registers.
  • the quadrant data of each element in the vector register VR1 and the vector register VR2 are added correspondingly, and the result is placed in the vector register VR3. That is, in the related technology, the data of the other three quadrants need to be calculated together, resulting in a waste of power consumption.
  • VADD.S8.Q1 VR3 VR2 VR1 represents the target calculation instruction of this application.
  • the target calculation instruction includes the parameters of the first quadrant.
  • Q1 refers to the parameters of the first quadrant
  • ADD means addition.
  • the output data can be obtained by simply adding the data of the first quadrant of each element in VR1 and VR2, avoiding the power consumption caused by calculating other quadrants, and achieving the effect of reducing power consumption.
  • the expression method of target calculation instructions is not limited to VADD.S8.Q1 VR3 VR2 VR1. There are many other expression methods of target calculation instructions, and the specific expression methods are not specifically limited in this application.
  • step 021 includes:
  • The above step can be implemented by the operation module 32. That is to say, the operation module 32 can be used to turn off the computing unit corresponding to the non-calculated part while performing calculations on the part to be calculated.
  • The method of reducing power consumption also includes:
  • 04: When the processor is in the target mode and a calculation instruction is received, determine the calculation instruction to be the target instruction.
  • Step 01 includes:
  • 011: According to the target mode and the calculation instruction, obtain the valid bits of the data to be processed as the part to be calculated.
  • The processor 30 also includes a processing module 33. Step 04 can be implemented by the processing module 33, and step 011 can be implemented by the preprocessing module 31. That is to say, the processing module 33 can be used to determine the calculation instruction to be the target instruction when the processor is in the target mode and a calculation instruction is received.
  • the preprocessing module 31 can be used to obtain the valid bits of the data to be processed as the parts to be calculated according to the target mode and calculation instructions.
  • the target mode may include a low-precision mode, a half-precision mode, and other modes.
  • the target mode includes the accuracy requirements for the output data. It is understandable that in some application scenarios, the accuracy of the data to be processed exceeds the accuracy that meets the functional requirements.
  • Usually, the data to be processed is calculated at its own precision, and the calculation results are then shifted, saturated, truncated and so on, which wastes power.
  • the data to be processed can be preprocessed before calculation according to the accuracy of the target mode, and the valid bits of the data to be processed are obtained as the parts to be calculated, thereby achieving the effect of reducing power consumption.
  • multiple target modes may be included, and the multiple target modes correspond to different accuracy requirements, so that the corresponding target mode can be determined according to the accuracy requirements. It is worth noting that whether the processor is in target mode can be adjusted according to application scenarios, processor load and other factors, and is not specifically limited in this application.
  • For example, the data to be processed is 32-bit data, and 16-bit calculation precision already meets the functional requirements.
  • The calculation instruction is determined to be the target instruction, and according to the target mode and the calculation instruction, the 16 valid bits of the data to be processed are obtained as the part to be calculated.
  • The 16-bit part to be calculated is then computed by a single 16-bit multiplier, so that one multiplication instruction can reduce power consumption by nearly 75% compared with using four 16-bit multipliers.
  • When the processor is in the target mode but has not received a calculation instruction, the processor does not start calculation, that is, the processor does not start to obtain the data to be processed. Therefore, the step of obtaining the valid bits of the data to be processed as the part to be calculated is performed only once the processor is in the target mode and has received a calculation instruction.
  • the calculation instructions in this embodiment may include the above target calculation instructions.
  • For example, the data to be processed includes a 32-bit real part and a 32-bit imaginary part, the calculation instruction includes adding the real part, and the target mode includes a precision requirement of 16 bits. Then, according to the target mode and the calculation instruction, the high 16 valid bits of the real part of the data to be processed are obtained as the part to be calculated.
  • step 01 includes:
  • the above steps can be implemented by the preprocessing module 31, that is to say, the preprocessing module 31 can be used to load the part to be calculated.
  • the method of reducing power consumption also includes:
  • According to the calculation instruction, perform operations on the data to be processed to obtain the operation result;
  • According to the target instruction, obtain the part to be stored of the operation result, where the operation result is divided into the part to be stored and a non-storage part;
  • Store the part to be stored and discard the non-storage part.
  • The processor 30 may also include a storage module, and the above steps may be implemented by the storage module. That is to say, the storage module may be used to perform operations on the data to be processed according to the calculation instruction to obtain the operation result; to obtain, according to the target instruction, the part to be stored of the operation result, where the operation result is divided into the part to be stored and a non-storage part; and to store the part to be stored and discard the non-storage part.
  • In this way, only the part to be stored is obtained and stored, which avoids the power consumption caused by storing the non-storage part and achieves the effect of reducing power consumption.
  • For example, the operation result includes a first quadrant part, a second quadrant part, a third quadrant part and a fourth quadrant part, and only the first quadrant part needs to be stored. The first quadrant part can then be obtained and stored, and the second, third and fourth quadrant parts discarded, which avoids the power consumption caused by storing the second, third and fourth quadrant parts.
  • the computer-readable storage medium in the embodiment of the present application has a computer program stored thereon.
  • When the program is executed by the processor, the steps of the method for reducing power consumption in any of the above embodiments are implemented.
  • 01 According to the target instruction, obtain the part to be calculated of the data to be processed, where the data to be processed is divided into the part to be calculated and the non-calculated part;
  • The computer program includes computer program code.
  • The computer program code may be in the form of source code, object code, an executable file, some intermediate form, etc.
  • Computer-readable storage media can include: any entity or device that can carry computer program code, recording media, USB flash drives, mobile hard drives, magnetic disks, optical disks, computer memory, read-only memory (ROM), random access memory (RAM), software distribution media, etc.
  • The processor can be a central processing unit or another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • References to the terms "one embodiment," "some embodiments," "an example," "specific examples," or "some examples" or the like mean that the specific features, structures, materials or characteristics described in connection with the embodiment or example are included in at least one embodiment or example of the present application. In this specification, the schematic expressions of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Those skilled in the art may also combine different embodiments or examples, and features of different embodiments or examples, described in this specification unless they are inconsistent with each other.
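
The low-precision target mode described in the definitions above (32-bit data, a 16-bit precision requirement, the high 16 valid bits taken as the part to be calculated) can be illustrated with a small Python sketch. The function names `valid_bits`, `multiply_low_precision` and `partial_products` are hypothetical and do not come from the application. The operands are assumed to be unsigned, and counting 16 x 16 partial products is only a rough proxy for the "nearly 75%" power figure quoted in the text, not a power model.

```python
def valid_bits(value: int, data_width: int = 32, precision: int = 16) -> int:
    """Return the high `precision` bits of an unsigned `data_width`-bit value."""
    value &= (1 << data_width) - 1            # keep only data_width bits
    return value >> (data_width - precision)  # drop the low-order bits


def multiply_low_precision(a: int, b: int) -> int:
    """Multiply only the 16 valid bits of two 32-bit operands (assumed unsigned)."""
    return valid_bits(a) * valid_bits(b)


def partial_products(width: int, unit: int = 16) -> int:
    """Count unit x unit partial products in a schoolbook width x width multiply."""
    blocks = width // unit
    return blocks * blocks


full = partial_products(32)     # four 16 x 16 partial products
reduced = partial_products(16)  # one 16 x 16 partial product
saving = 1 - reduced / full     # fraction of multiplier work avoided
```

Under these assumptions, a full 32 x 32 multiply decomposes into four 16 x 16 products while the low-precision path needs one, which is where a saving of roughly three quarters of the multiplier work would come from.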

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Power Sources (AREA)

Abstract

A method for reducing power consumption, and a processor (30), an electronic device (100) and a storage medium. The method for reducing power consumption comprises: according to a target instruction, determining a part to be calculated of data to be processed, wherein the data to be processed comprises the part to be calculated and a non-computational part; and calculating the part to be calculated, so as to obtain output data.

Description

Method for Reducing Power Consumption, Processor, Electronic Device and Storage Medium
Priority Information
This application claims priority to and the benefit of patent application No. 202210960729.8, filed with the China National Intellectual Property Administration on August 11, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to processor technology, and in particular to a method for reducing power consumption, a processor, an electronic device and a storage medium.
Background
As integrated circuit design and manufacturing processes improve, the performance and integration level of processors keep increasing. Reducing the power consumption of the processor while maintaining chip performance has become one research direction for reducing chip power consumption.
Summary
Embodiments of the present application provide a method for reducing power consumption, a processor, an electronic device, and a storage medium.
The method for reducing power consumption in the embodiment of the present application includes: determining, according to a target instruction, the part to be calculated of the data to be processed, wherein the data to be processed includes the part to be calculated and a non-calculated part; and performing an operation on the part to be calculated to obtain output data.
The processor in the embodiment of the present application includes a preprocessing module and an operation module. The preprocessing module is used to obtain, according to a target instruction, the part to be calculated of the data to be processed, where the data to be processed is divided into the part to be calculated and a non-calculated part; the operation module is used to perform operations on the part to be calculated to obtain output data.
The electronic device in the embodiment of the present application includes a housing and the processor described in the above embodiment.
The computer-readable storage medium in the embodiment of the present application has a computer program stored thereon. When the program is executed by a processor, the steps of the method for reducing power consumption described in the above embodiment are implemented.
Additional aspects and advantages of the present application will be set forth in part in the description that follows, and in part will become apparent from the description or be learned through practice of the application.
Description of Drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily understood from the description of the embodiments in conjunction with the following drawings, in which:
Figure 1 is a schematic flowchart of a method for reducing power consumption in some embodiments of the present application;
Figure 2 is a schematic diagram of a processor in some embodiments of the present application;
Figure 3 is a schematic diagram of an electronic device in some embodiments of the present application;
Figure 4 is a schematic flowchart of a method for reducing power consumption in some embodiments of the present application;
Figure 5 is a schematic diagram comparing the method for reducing power consumption in some embodiments of the present application with related technologies;
Figure 6 is a schematic diagram comparing the method for reducing power consumption in some embodiments of the present application with related technologies;
Figure 7 is a schematic flowchart of a method for reducing power consumption in some embodiments of the present application;
Figure 8 is a schematic diagram of a processor in some embodiments of the present application;
Figure 9 is a schematic flowchart of a method for reducing power consumption in some embodiments of the present application.
Detailed Description
Embodiments of the present application are described in detail below, and examples of the embodiments are illustrated in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended only to explain the present application, and are not to be construed as limiting it.
In the description of the embodiments of the present application, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present application, "plurality" means two or more, unless otherwise explicitly and specifically limited.
Referring to Figure 1, the method for reducing power consumption according to an embodiment of the present application includes:
01: Determining, according to a target instruction, the part to be calculated of the data to be processed, where the data to be processed includes the part to be calculated and a non-calculated part.
The target instruction provides the basis for determining the part to be calculated of the data to be processed. Specifically, the target instruction may include not only a portion that provides the basis for determining the part to be calculated, but also a portion that provides the basis for operating on the part to be calculated. It is sufficient that the part to be calculated can be determined from the target instruction, and the present application imposes no specific limitation. For ease of understanding, take data to be processed that includes a real part and an imaginary part. In one embodiment, the target instruction may include calculating the real part, so that the part to be calculated is determined, according to the target instruction, to be the real part. In another embodiment, the target instruction may include adding the imaginary parts, so that the part to be calculated is determined, according to the target instruction, to be the imaginary part, and the imaginary parts are added according to the target instruction to obtain the output data.
The part to be calculated and the non-calculated part are different parts of the same data; that is, together they make up the same data to be processed. The method of the present application reduces power consumption by identifying the part of the data to be processed that does not need to be operated on.
02: Performing an operation on the part to be calculated to obtain output data.
The operation performed on the part to be calculated may include, but is not limited to, addition, subtraction, multiplication and division, and the present application imposes no specific limitation.
Referring to Figure 2, the processor 30 according to an embodiment of the present application includes a preprocessing module 31 and an operation module 32.
The method for reducing power consumption of the present application can be implemented by the processor 30 of the embodiments of the present application, where step 01 can be implemented by the preprocessing module 31 and step 02 can be implemented by the operation module 32. In other words, the preprocessing module 31 can be used to obtain, according to the target instruction, the part to be calculated of the data to be processed, where the data to be processed is divided into the part to be calculated and the non-calculated part. The operation module 32 can be used to perform an operation on the part to be calculated to obtain the output data.
In the above method for reducing power consumption and the processor 30, the data to be processed can be divided into a part to be calculated and a non-calculated part. When the output data is calculated, only the part to be calculated is operated on and the non-calculated part is not, which avoids redundant calculation and thereby reduces power consumption.
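Steps 01 and 02 can be illustrated with a minimal software sketch. It is only a model of the idea, not the processor 30 itself: the tuple encoding of the instruction, the dictionary representation of the data, and the names `determine_part` and `run` are all hypothetical and do not appear in the application.

```python
# Hypothetical instruction encoding: (operation, part selector).
# The application does not define an encoding; this is an assumption.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
}


def determine_part(instruction, data):
    """Step 01: select the part to be calculated named by the instruction.

    Every other key of `data` is the non-calculated part and is never read.
    """
    _, selector = instruction
    return data[selector]


def run(instruction, lhs, rhs):
    """Step 02: operate only on the selected parts and return the output."""
    op, _ = instruction
    a = determine_part(instruction, lhs)
    b = determine_part(instruction, rhs)
    return OPS[op](a, b)


if __name__ == "__main__":
    x = {"real": 1, "imag": 2}
    y = {"real": 10, "imag": 20}
    print(run(("add", "imag"), x, y))
```

In this sketch the real parts are never touched when the instruction selects the imaginary part, which mirrors how the non-calculated part is excluded from the operation.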
In some embodiments, referring to Figure 3, the processor 30 may be applied to an electronic device 100. The electronic device 100 may be a smartphone, a tablet computer, a smart watch, a smart bracelet or a similar device, and no specific limitation is imposed here. The electronic device 100 of the embodiments of the present application is described using a smartphone as an example, which is not to be understood as limiting the present application. The electronic device 100 further includes a housing 50.
Specifically, in the related art, power consumption can be reduced by techniques such as lowering the voltage, lowering the frequency, and power gating. In other related art, some processors are configured with dedicated registers to mask redundant calculations. However, configuring such registers introduces many unnecessary operations, and the extra registers also introduce unnecessary hardware overhead.
The present application does not need to configure registers and thus avoids the extra hardware overhead. By obtaining the part to be calculated of the data to be processed according to the target instruction, the present application avoids calculating the non-calculated part and masks redundant calculations, thereby reducing power consumption.
It is worth noting that step 01 may obtain, according to the target instruction, the parts to be calculated of several pieces of data to be processed, and step 02 may then operate on each of those parts. The present application imposes no specific limitation.
In some embodiments, referring to Figure 4, the target instruction includes a target calculation instruction, and step 02 includes:
021: Performing an operation on the part to be calculated according to the target calculation instruction to obtain the output data.
In some embodiments, the above step can be implemented by the operation module 32. In other words, the operation module 32 can be used to perform an operation on the part to be calculated according to the target calculation instruction to obtain the output data.
In this way, the output data can be obtained according to the target calculation instruction.
Specifically, the target instruction may include the target calculation instruction, and may also include other instructions. The target calculation instruction can be understood as an instruction that both provides the basis for obtaining the part to be calculated of the data to be processed and provides the basis for operating on that part. For ease of understanding, an example follows. In one embodiment, the data to be processed includes a real part and an imaginary part, and the target instruction includes a target calculation instruction that adds the real parts of the data to be processed. The real part of the data to be processed is then obtained according to that target calculation instruction, and the real parts are added according to the same instruction to obtain the output data.
Further, the data to be processed includes a real part and an imaginary part, and step 01 includes:
obtaining, according to the target calculation instruction, the real part or the imaginary part as the part to be calculated;
and step 021 includes:
performing an operation on the real part or the imaginary part according to the target calculation instruction to obtain the output data.
In some embodiments, obtaining the real part or the imaginary part as the part to be calculated according to the target calculation instruction can be implemented by the preprocessing module 31, and performing an operation on the real part or the imaginary part according to the target calculation instruction to obtain the output data can be implemented by the operation module 32. In other words, the preprocessing module 31 can be used to obtain, according to the target calculation instruction, the real part or the imaginary part as the part to be calculated, and the operation module 32 can be used to perform an operation on the real part or the imaginary part according to the target calculation instruction to obtain the output data.
In this way, the real part or imaginary part that needs to be calculated can be obtained according to the target calculation instruction, while the part that does not need to be calculated is masked, which removes redundancy and thereby reduces power consumption.
Specifically, the target calculation instruction may include, but is not limited to, calculation instructions such as adding real parts, multiplying real parts, subtracting real parts, and adding imaginary parts, and the present application imposes no specific limitation. The real part may be obtained according to the target instruction and then operated on, or the imaginary part may be obtained according to the target instruction and then operated on; the present application imposes no specific limitation.
For ease of understanding, an example follows. Referring to the upper half of Figure 5, in the related art, VSUB.SC16 VR3 VR2 VR1 denotes a calculation instruction that includes neither a real-part parameter nor an imaginary-part parameter, and VR1, VR2 and VR3 denote different vector registers. In this case, according to the calculation instruction, the real parts and the imaginary parts of the corresponding elements in vector register VR1 and vector register VR2 are subtracted from each other, and the result is placed in vector register VR3. That is, in the related art the part that is not needed is calculated as well, which increases power consumption.
Referring to the lower half of Figure 5, VSUB.SC16.IMAG VR3 VR2 VR1 denotes a target calculation instruction of the present application. The target calculation instruction includes a real-part parameter or an imaginary-part parameter; for example, IMAG is the imaginary-part parameter and SUB denotes subtraction. According to the target calculation instruction, only the imaginary parts of the corresponding elements in vector register VR1 and vector register VR2 need to be subtracted, and the operation units corresponding to the real parts are switched off automatically. The output data is thus obtained without the power consumption that calculating the real parts would incur, which reduces power consumption. It can be understood that the target calculation instruction is not limited to the form VSUB.SC16.IMAG VR3 VR2 VR1; many other representations are possible, and the present application imposes no specific limitation on the representation.
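The contrast between the two instructions in Figure 5 can be modelled in software as follows. This is a hedged sketch: the function name `vsub_sc16`, the representation of a vector register as a list of (real, imaginary) pairs, the operand order VR2 minus VR1, and the absence of 16-bit overflow handling are all assumptions made for illustration, since the application does not specify them.

```python
def vsub_sc16(vr2, vr1, part=None):
    """Element-wise subtraction of two vector registers of complex values.

    vr2, vr1: lists of (real, imag) integer pairs (assumed layout).
    part: None models VSUB.SC16 (both parts are calculated);
          "IMAG" or "REAL" models a part-selecting target calculation
          instruction such as VSUB.SC16.IMAG.
    Returns (vr3, subtractions), where `subtractions` counts the scalar
    subtractions performed, a rough stand-in for active operation units.
    The operand order vr2 - vr1 is an assumption.
    """
    vr3 = []
    subtractions = 0
    for (re2, im2), (re1, im1) in zip(vr2, vr1):
        if part is None:
            vr3.append((re2 - re1, im2 - im1))
            subtractions += 2
        elif part == "IMAG":
            vr3.append(im2 - im1)  # real-part unit stays idle
            subtractions += 1
        elif part == "REAL":
            vr3.append(re2 - re1)  # imaginary-part unit stays idle
            subtractions += 1
        else:
            raise ValueError(f"unknown part selector: {part!r}")
    return vr3, subtractions


if __name__ == "__main__":
    vr1 = [(1, 2), (3, 4)]
    vr2 = [(10, 20), (30, 40)]
    print(vsub_sc16(vr2, vr1))          # related art: both parts
    print(vsub_sc16(vr2, vr1, "IMAG"))  # target calculation instruction
```

Under these assumptions the part-selecting form performs half as many scalar subtractions as the full form for the same registers.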
In some embodiments, step 01 includes:
obtaining, according to the target calculation instruction, at least one quadrant part among several quadrant parts as the part to be calculated;
and step 021 includes:
performing an operation on the at least one quadrant part among the several quadrant parts according to the target calculation instruction to obtain the output data.
In some embodiments, obtaining at least one quadrant part among several quadrant parts as the part to be calculated according to the target calculation instruction can be implemented by the preprocessing module 31, and performing an operation on the at least one quadrant part according to the target calculation instruction to obtain the output data can be implemented by the operation module 32. In other words, the preprocessing module 31 can be used to obtain, according to the target calculation instruction, at least one quadrant part among several quadrant parts as the part to be calculated, and the operation module 32 can be used to perform an operation on the at least one quadrant part according to the target calculation instruction to obtain the output data.
In this way, the quadrant part that needs to be calculated can be obtained according to the target calculation instruction, while the parts that do not need to be calculated are masked, which removes redundancy and thereby reduces power consumption.
Specifically, the data to be processed may include two quadrant parts, three quadrant parts, four quadrant parts and so on, and the present application imposes no specific limitation. The target calculation instruction may include, but is not limited to, adding first quadrant parts, subtracting second quadrant parts and so on, and the present application imposes no specific limitation. The quadrant part that needs to be calculated can be obtained according to the target instruction and then operated on.
For ease of understanding, an example follows. Referring to the upper half of Figure 6 and taking data to be processed that includes four quadrant parts as an example, in the related art VADD.S8 VR3 VR2 VR1 denotes a calculation instruction that does not include a quadrant parameter, and VR1, VR2 and VR3 denote different vector registers. In this case, according to the calculation instruction, the quadrant data of the corresponding elements in vector register VR1 and vector register VR2 are added, and the result is placed in vector register VR3. That is, in the related art the data of the other three quadrant parts is calculated as well, which wastes power.
Referring to the lower half of Figure 6, VADD.S8.Q1 VR3 VR2 VR1 denotes a target calculation instruction of the present application. The target calculation instruction includes a parameter for the first quadrant part; for example, Q1 is the first-quadrant parameter and ADD denotes addition. According to the target calculation instruction, only the first-quadrant data of the corresponding elements in VR1 and VR2 needs to be added to obtain the output data, which avoids the power consumption of calculating the other quadrant parts and thereby reduces power consumption. It can be understood that the target calculation instruction is not limited to the form VADD.S8.Q1 VR3 VR2 VR1; many other representations are possible, and the present application imposes no specific limitation on the representation.
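The quadrant-selecting instruction of Figure 6 can be sketched in the same spirit. The application does not define how a quadrant part is laid out, so this sketch assumes each register element is a tuple of four integers, one per quadrant part, with index 0 standing for Q1. The function name `vadd_s8` and the returned count of enabled units are likewise hypothetical, and no 8-bit overflow handling is modelled.

```python
def vadd_s8(vr2, vr1, quadrants=None):
    """Element-wise addition restricted to selected quadrant parts.

    vr2, vr1: lists of 4-tuples, one integer per quadrant part (assumed layout).
    quadrants: iterable of 0-based quadrant indices to calculate;
               None models VADD.S8 (all four quadrant parts are added),
               [0] models VADD.S8.Q1.
    Returns (vr3, units_enabled): vr3 holds, per element, a dict mapping each
    calculated quadrant index to its sum; units_enabled is the number of
    per-quadrant operation units that had to stay on.
    """
    enabled = list(range(4)) if quadrants is None else sorted(set(quadrants))
    if any(q not in range(4) for q in enabled):
        raise ValueError("quadrant indices must be in 0..3")
    vr3 = []
    for e2, e1 in zip(vr2, vr1):
        # Only the enabled quadrant parts are read and added.
        vr3.append({q: e2[q] + e1[q] for q in enabled})
    return vr3, len(enabled)


if __name__ == "__main__":
    vr1 = [(1, 2, 3, 4)]
    vr2 = [(10, 20, 30, 40)]
    print(vadd_s8(vr2, vr1))        # related art: four quadrant parts
    print(vadd_s8(vr2, vr1, [0]))   # target calculation instruction, Q1 only
```

With Q1 selected, one unit out of four remains enabled in this model, which corresponds to switching off the operation units of the non-calculated parts.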
In some embodiments, step 021 includes:
switching off the operation units corresponding to the non-calculated part while the part to be calculated is being operated on.
The above step can be implemented by the operation module 32. In other words, the operation module 32 can be used to switch off the operation units corresponding to the non-calculated part while the part to be calculated is being operated on.
In this way, the operation units corresponding to the non-calculated part do not consume power.
In some embodiments, referring to Figure 7, the method for reducing power consumption further includes:
04: Determining a calculation instruction to be the target instruction when the processor is in a target mode and the calculation instruction is received;
and step 01 includes:
011: Obtaining, according to the target mode and the calculation instruction, the valid bits of the data to be processed as the part to be calculated.
In some embodiments, referring to Figure 8, the processor 30 further includes a processing module 33. Step 04 can be implemented by the processing module 33, and step 011 can be implemented by the preprocessing module 31. In other words, the processing module 33 can be used to determine the calculation instruction to be the target instruction when the processor is in the target mode and the calculation instruction is received, and the preprocessing module 31 can be used to obtain, according to the target mode and the calculation instruction, the valid bits of the data to be processed as the part to be calculated.
In this way, only the valid bits of the data to be processed are calculated, which reduces power consumption.
Specifically, the target mode may include a low-precision mode, a half-precision mode and other modes. The target mode carries the precision requirement for the output data. It can be understood that in some application scenarios the precision of the data to be processed exceeds the precision needed to meet the functional requirement. In the related art, the data to be processed is usually calculated at its full precision, and the result is then shifted, saturated, truncated or otherwise processed, which wastes power. In the embodiments of the present application, the data to be processed can be preprocessed before calculation according to the precision of the target mode, and the valid bits of the data to be processed are obtained as the part to be calculated, which reduces power consumption. It is worth noting that there may be multiple target modes corresponding to different precision requirements, so that the appropriate target mode can be determined from the precision requirement. It is also worth noting that whether the processor is in the target mode can be adjusted according to factors such as the application scenario and the processor load, and the present application imposes no specific limitation.
For ease of understanding, an example follows. In one embodiment, the data to be processed is 32-bit data, while a calculation precision of 16 bits already meets the functional requirement. When the processor is in a target mode with 16-bit calculation precision and receives a calculation instruction, the calculation instruction is determined to be the target instruction, and the 16 valid bits of the data to be processed are obtained, according to the target mode and the calculation instruction, as the part to be calculated. Specifically, in combination with a reconfigurable design, for example one in which four 16-bit multipliers are combined into one 32-bit multiplier, the 16-bit part to be calculated is obtained before the 32-bit data to be processed is calculated, and that part is calculated by a single 16-bit multiplier. Compared with using four 16-bit multipliers, a single multiplication instruction can thus reduce power consumption by nearly 75%.
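The arithmetic behind the "nearly 75%" figure can be checked with a small sketch. It assumes unsigned 32-bit operands, assumes that the valid bits are the upper 16 bits of each operand, and uses the number of 16-bit by 16-bit partial products as a proxy for multiplier power; none of these details is stated in the application, and the function names are hypothetical.

```python
MASK16 = 0xFFFF


def split16(x):
    """Split an unsigned 32-bit value into its (high, low) 16-bit halves."""
    return (x >> 16) & MASK16, x & MASK16


def mul32_full(a, b):
    """Full 32x32 product assembled from four 16x16 partial products.

    Returns (product, multipliers_used).
    """
    ah, al = split16(a)
    bh, bl = split16(b)
    partials = [ah * bh, ah * bl, al * bh, al * bl]  # four 16-bit multipliers
    product = (partials[0] << 32) + ((partials[1] + partials[2]) << 16) + partials[3]
    return product, len(partials)


def mul32_reduced(a, b):
    """16-bit target mode: multiply only the upper 16 bits of each operand.

    Returns (approximate_product, multipliers_used). The low halves are the
    non-calculated part and are ignored.
    """
    ah, _ = split16(a)
    bh, _ = split16(b)
    return (ah * bh) << 32, 1


# Fraction of 16-bit multipliers left idle in the reduced mode.
multiplier_saving = 1 - mul32_reduced(0, 0)[1] / mul32_full(0, 0)[1]

if __name__ == "__main__":
    a, b = 0x12345678, 0x9ABCDEF0
    print(mul32_full(a, b)[0] == a * b)
    print(multiplier_saving)
```

Using one of four 16-bit multipliers leaves three quarters of the multiplier array idle, which is consistent with the reduction of nearly 75% stated above; the actual saving in hardware depends on factors this model does not capture.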
It is worth noting that when the processor is in the target mode but has not received a calculation instruction, the processor has not started calculating, that is, it has not started obtaining the data to be processed. The step of obtaining the valid bits of the data to be processed as the part to be calculated is therefore performed only when the processor is in the target mode and has received a calculation instruction.
It should be added that the calculation instruction of this embodiment may include the target calculation instruction described above. When the calculation instruction is a target calculation instruction, the target mode and the target calculation instruction can be combined to obtain the part to be calculated of the data to be processed. For ease of understanding, an example follows. In some embodiments, the data to be processed includes a 32-bit real part and a 32-bit imaginary part, the calculation instruction includes adding the real parts, and the target mode specifies a precision requirement of 16 bits. The upper 16 valid bits of the real part of the data to be processed can then be obtained, according to the target mode and the calculation instruction, as the part to be calculated.
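Combining the target mode with a part-selecting instruction can be sketched as follows. The function name `part_to_calculate`, the (real, imaginary) pair representation, and the treatment of values as unsigned integers are assumptions made for this illustration only.

```python
def part_to_calculate(element, part, mode_bits, width=32):
    """Select the part named by the instruction, then keep its upper bits.

    element: (real, imag) pair of unsigned `width`-bit integers (assumed).
    part: "real" or "imag", taken from the calculation instruction.
    mode_bits: precision required by the target mode, e.g. 16.
    """
    if part not in ("real", "imag"):
        raise ValueError(f"unknown part selector: {part!r}")
    real, imag = element
    value = real if part == "real" else imag
    # Keep only the upper `mode_bits` bits as the valid bits.
    return value >> (width - mode_bits)


if __name__ == "__main__":
    a = (0x00050001, 0xDEADBEEF)
    b = (0x00070002, 0x12345678)
    # "Add the real parts" in a 16-bit target mode: only the upper
    # 16 bits of each real part enter the addition.
    print(part_to_calculate(a, "real", 16) + part_to_calculate(b, "real", 16))
```

In this model both the imaginary part and the lower 16 bits of the real part are treated as the non-calculated part.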
In some embodiments, referring to Figure 9, step 01 includes:
015: Loading the part to be calculated.
In some embodiments, the above step can be implemented by the preprocessing module 31. In other words, the preprocessing module 31 can be used to load the part to be calculated.
In this way, only the part to be calculated is loaded and the non-calculated part is ignored, which avoids the power wasted by loading the non-calculated part.
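Step 015 can be pictured with a small sketch of a selective load. The interleaved memory layout (real, imaginary, real, imaginary, and so on) and the function name `load_part` are assumptions; the application does not describe how the data to be processed is laid out in memory.

```python
def load_part(memory, part):
    """Load only the part to be calculated from interleaved complex data.

    memory: flat list laid out as [re0, im0, re1, im1, ...] (assumed layout).
    part: "real" or "imag".
    Returns (loaded_values, reads), where `reads` counts the memory words
    actually read.
    """
    offset = {"real": 0, "imag": 1}[part]
    loaded = memory[offset::2]  # stride of 2 skips the non-calculated part
    return loaded, len(loaded)


if __name__ == "__main__":
    memory = [1, 2, 3, 4, 5, 6]
    print(load_part(memory, "imag"))
```

Under this layout, half of the memory words are never read when only one part is needed.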
In some embodiments, the method for reducing power consumption further includes:
performing an operation on the data to be processed according to the calculation instruction to obtain an operation result;
obtaining, according to the target instruction, the part to be stored of the operation result, where the operation result is divided into the part to be stored and a non-stored part;
storing the part to be stored and discarding the non-stored part.
In some embodiments, the processor 30 may further include a storage module, and the above steps can be implemented by the storage module. In other words, the storage module can be used to perform an operation on the data to be processed according to the calculation instruction to obtain an operation result; to obtain, according to the target instruction, the part to be stored of the operation result, where the operation result is divided into the part to be stored and a non-stored part; and to store the part to be stored and discard the non-stored part.
In this way, the part to be stored can be obtained and stored, and the power consumption that storing the non-stored part would incur is avoided, which reduces power consumption.
Specifically, in some application scenarios, the operation result includes, for example, a first quadrant part, a second quadrant part, a third quadrant part and a fourth quadrant part, while only the first quadrant part needs to be stored. In this case, the first quadrant part can be obtained and stored, and the second, third and fourth quadrant parts are discarded, which avoids the power consumption of storing them.
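The selective store described above can be sketched as follows. The dictionary representation of the operation result, the key names "Q1" to "Q4", and the function name `store_result` are hypothetical; the write count is only a rough stand-in for the cost of a store.

```python
def store_result(result, keep, memory):
    """Store only the part to be stored and discard the rest.

    result: dict mapping quadrant-part names to values (assumed layout).
    keep: names of the parts to be stored, taken from the target instruction.
    memory: dict standing in for the destination storage.
    Returns the number of writes performed.
    """
    writes = 0
    for name in keep:
        memory[name] = result[name]
        writes += 1
    # Parts not listed in `keep` are simply never written.
    return writes


if __name__ == "__main__":
    result = {"Q1": 11, "Q2": 22, "Q3": 33, "Q4": 44}
    memory = {}
    print(store_result(result, ["Q1"], memory), memory)
```

Here a single write replaces four, matching the scenario in which only the first quadrant part needs to be kept.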
A computer-readable storage medium according to an embodiment of the present application stores a computer program. When the program is executed by a processor, the steps of the method for reducing power consumption of any of the above embodiments are implemented.
For example, 01: obtaining, according to the target instruction, the part to be calculated of the data to be processed, where the data to be processed is divided into the part to be calculated and the non-calculated part;
02: performing an operation on the part to be calculated to obtain the output data.
It can be understood that the computer program includes computer program code. The computer program code may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable storage medium may include any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), random access memory (RAM), a software distribution medium, and the like. The processor may be a central processing unit, or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "an example", "a specific example", "some examples" or the like means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present application. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Any process or method description in a flowchart, or otherwise described herein, can be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing a specific logical function or step of the process. The scope of the preferred embodiments of the present application includes other implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
Although embodiments of the present application have been shown and described above, it can be understood that these embodiments are exemplary and are not to be construed as limiting the present application. Those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present application.

Claims (22)

  1. A method for reducing power consumption, characterized in that the method for reducing power consumption comprises:
    determining, according to a target instruction, a to-be-calculated part of data to be processed, wherein the data to be processed includes the to-be-calculated part and a non-calculated part; and
    performing an operation on the to-be-calculated part to obtain output data.
  2. The method for reducing power consumption according to claim 1, characterized in that the target instruction includes a target calculation instruction, and
    the performing an operation on the to-be-calculated part to obtain output data comprises:
    performing, according to the target calculation instruction, the operation on the to-be-calculated part to obtain the output data.
  3. The method for reducing power consumption according to claim 2, characterized in that the data to be processed includes a real part and an imaginary part,
    the obtaining, according to the target instruction, the to-be-calculated part of the data to be processed comprises:
    obtaining, according to the target calculation instruction, the real part or the imaginary part as the to-be-calculated part; and
    the performing, according to the target calculation instruction, the operation on the to-be-calculated part to obtain the output data comprises:
    performing, according to the target calculation instruction, the operation on the real part or the imaginary part to obtain the output data.
  4. The method for reducing power consumption according to claim 2, characterized in that the data to be processed includes a plurality of quadrant parts,
    the obtaining, according to the target instruction, the to-be-calculated part of the data to be processed comprises:
    obtaining, according to the target calculation instruction, at least one of the plurality of quadrant parts as the to-be-calculated part; and
    the performing, according to the target calculation instruction, the operation on the to-be-calculated part to obtain the output data comprises:
    performing, according to the target calculation instruction, the operation on the at least one of the plurality of quadrant parts to obtain the output data.
  5. The method for reducing power consumption according to claim 2, characterized in that the performing, according to the target calculation instruction, the operation on the to-be-calculated part to obtain the output data comprises:
    turning off, while the operation is performed on the to-be-calculated part, the arithmetic unit corresponding to the non-calculated part.
  6. The method for reducing power consumption according to claim 1, characterized in that the method for reducing power consumption further comprises:
    determining, when the processor is in a target mode and a calculation instruction is received, the calculation instruction as the target instruction; and
    the obtaining, according to the target instruction, the to-be-calculated part of the data to be processed comprises:
    obtaining, according to the target mode and the calculation instruction, the valid bits of the data to be processed as the to-be-calculated part.
  7. The method for reducing power consumption according to claim 1, characterized in that the obtaining, according to the target instruction, the to-be-calculated part of the data to be processed comprises:
    loading the to-be-calculated portion.
  8. A processor, characterized in that the processor comprises:
    a preprocessing module, configured to obtain, according to a target instruction, a to-be-calculated part of data to be processed, wherein the data to be processed is divided into the to-be-calculated part and a non-calculated part; and
    an operation module, configured to perform an operation on the to-be-calculated part to obtain output data.
  9. The processor according to claim 8, characterized in that the target instruction includes a target calculation instruction, and
    the operation module is configured to:
    perform, according to the target calculation instruction, the operation on the to-be-calculated part to obtain the output data.
  10. The processor according to claim 9, characterized in that the data to be processed includes a real part and an imaginary part,
    the preprocessing module is configured to:
    obtain, according to the target calculation instruction, the real part or the imaginary part as the to-be-calculated part; and
    the operation module is configured to:
    perform, according to the target calculation instruction, the operation on the real part or the imaginary part to obtain the output data.
  11. The processor according to claim 9, characterized in that the data to be processed includes a plurality of quadrant parts,
    the preprocessing module is configured to:
    obtain, according to the target calculation instruction, at least one of the plurality of quadrant parts as the to-be-calculated part; and
    the operation module is configured to:
    perform, according to the target calculation instruction, the operation on the at least one of the plurality of quadrant parts to obtain the output data.
  12. The processor according to claim 9, characterized in that the operation module is configured to:
    turn off, while the operation is performed on the to-be-calculated part, the arithmetic unit corresponding to the non-calculated part.
  13. The processor according to claim 8, characterized in that the processor further comprises a processing module, and the processing module is configured to:
    determine, when the processor is in a target mode and a calculation instruction is received, the calculation instruction as the target instruction; and
    the preprocessing module is configured to:
    obtain, according to the target mode and the calculation instruction, the valid bits of the data to be processed as the to-be-calculated part.
  14. The processor according to claim 8, characterized in that the preprocessing module is configured to:
    load the to-be-calculated portion.
  15. An electronic device, characterized in that the electronic device comprises a housing and a processor, and the processor comprises:
    a preprocessing module, configured to obtain, according to a target instruction, a to-be-calculated part of data to be processed, wherein the data to be processed is divided into the to-be-calculated part and a non-calculated part; and
    an operation module, configured to perform an operation on the to-be-calculated part to obtain output data.
  16. The electronic device according to claim 15, characterized in that the target instruction includes a target calculation instruction, and
    the operation module is configured to:
    perform, according to the target calculation instruction, the operation on the to-be-calculated part to obtain the output data.
  17. The electronic device according to claim 16, characterized in that the data to be processed includes a real part and an imaginary part,
    the preprocessing module is configured to:
    obtain, according to the target calculation instruction, the real part or the imaginary part as the to-be-calculated part; and
    the operation module is configured to:
    perform, according to the target calculation instruction, the operation on the real part or the imaginary part to obtain the output data.
  18. The electronic device according to claim 16, characterized in that the data to be processed includes a plurality of quadrant parts,
    the preprocessing module is configured to:
    obtain, according to the target calculation instruction, at least one of the plurality of quadrant parts as the to-be-calculated part; and
    the operation module is configured to:
    perform, according to the target calculation instruction, the operation on the at least one of the plurality of quadrant parts to obtain the output data.
  19. The electronic device according to claim 16, characterized in that the operation module is configured to:
    turn off, while the operation is performed on the to-be-calculated part, the arithmetic unit corresponding to the non-calculated part.
  20. The electronic device according to claim 15, characterized in that the processor further comprises a processing module, and the processing module is configured to:
    determine, when the processor is in a target mode and a calculation instruction is received, the calculation instruction as the target instruction; and
    the preprocessing module is configured to:
    obtain, according to the target mode and the calculation instruction, the valid bits of the data to be processed as the to-be-calculated part.
  21. The electronic device according to claim 15, characterized in that the preprocessing module is configured to:
    load the to-be-calculated portion.
  22. A computer-readable storage medium storing a computer program, characterized in that, when the program is executed by a processor, the steps of the method for reducing power consumption according to any one of claims 1 to 7 are implemented.
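
For illustration only, and not as part of the claims, the flow recited in claims 1 to 3, 5 and 6 can be sketched in software. This is a minimal sketch under stated assumptions: the instruction names (`SCALE_REAL`, `SCALE_IMAG`), the `UnitStats` counter standing in for arithmetic units that can be turned off, and the 16-bit/8-bit split used for the "valid bits" are all hypothetical choices made for the example and do not come from the application.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class UnitStats:
    """Counts how often each (hypothetical) arithmetic unit was active."""
    real_ops: int = 0
    imag_ops: int = 0


def select_part(instruction: str, data: complex) -> Tuple[str, float]:
    """Pick the to-be-calculated part named by the target instruction.

    The instruction suffixes are assumptions made for this sketch.
    """
    if instruction.endswith("_REAL"):
        return "real", data.real
    if instruction.endswith("_IMAG"):
        return "imag", data.imag
    raise ValueError(f"not a target instruction: {instruction}")


def execute(instruction: str, data: complex, scale: float,
            stats: UnitStats) -> float:
    """Operate only on the selected part; the other unit stays idle."""
    part, value = select_part(instruction, data)
    if part == "real":
        stats.real_ops += 1   # imaginary-part unit is never touched
    else:
        stats.imag_ops += 1   # real-part unit is never touched
    return value * scale


def valid_bits(word: int, total_bits: int = 16, kept_bits: int = 8) -> int:
    """Return the upper `kept_bits` of a `total_bits`-wide word.

    Treating the upper bits as the "valid bits" in a target mode is an
    assumption made for illustration.
    """
    mask = (1 << total_bits) - 1
    return (word & mask) >> (total_bits - kept_bits)


stats = UnitStats()
result = execute("SCALE_REAL", complex(3, 4), 2.0, stats)
```

In this sketch the idle unit is modelled only by a counter that is never incremented; in hardware, the corresponding step would be turning off the arithmetic unit for the non-calculated part, as recited in claim 5.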
PCT/CN2023/089977 2022-08-11 2023-04-23 Method for reducing power consumption, and processor, electronic device and storage medium WO2024032027A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210960729.8A CN115390654A (en) 2022-08-11 2022-08-11 Method for reducing power consumption, processor, electronic device and storage medium
CN202210960729.8 2022-08-11

Publications (1)

Publication Number Publication Date
WO2024032027A1 (en) 2024-02-15

Family

ID=84118554

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/089977 WO2024032027A1 (en) 2022-08-11 2023-04-23 Method for reducing power consumption, and processor, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN115390654A (en)
WO (1) WO2024032027A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115390654A (en) * 2022-08-11 2022-11-25 Oppo广东移动通信有限公司 Method for reducing power consumption, processor, electronic device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4870593A (en) * 1988-04-08 1989-09-26 Tektronix, Inc. Circuit and method for determining the phase angle of a complex electrical signal
CN1855031A (en) * 2005-04-18 2006-11-01 展讯通信(上海)有限公司 Use of fixed-point divide in video encode stream control
CN107729989A (en) * 2017-07-20 2018-02-23 上海寒武纪信息科技有限公司 A kind of device and method for being used to perform artificial neural network forward operation
CN113986194A (en) * 2021-10-09 2022-01-28 清华大学 Neural network approximate multiplier implementation method and device based on preprocessing
WO2022028577A1 (en) * 2020-08-06 2022-02-10 北京灵汐科技有限公司 Processing mode determining method, and data processing method
CN115390654A (en) * 2022-08-11 2022-11-25 Oppo广东移动通信有限公司 Method for reducing power consumption, processor, electronic device and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4870593A (en) * 1988-04-08 1989-09-26 Tektronix, Inc. Circuit and method for determining the phase angle of a complex electrical signal
CN1855031A (en) * 2005-04-18 2006-11-01 展讯通信(上海)有限公司 Use of fixed-point divide in video encode stream control
CN107729989A (en) * 2017-07-20 2018-02-23 上海寒武纪信息科技有限公司 A kind of device and method for being used to perform artificial neural network forward operation
WO2022028577A1 (en) * 2020-08-06 2022-02-10 北京灵汐科技有限公司 Processing mode determining method, and data processing method
CN113986194A (en) * 2021-10-09 2022-01-28 清华大学 Neural network approximate multiplier implementation method and device based on preprocessing
CN115390654A (en) * 2022-08-11 2022-11-25 Oppo广东移动通信有限公司 Method for reducing power consumption, processor, electronic device and storage medium

Also Published As

Publication number Publication date
CN115390654A (en) 2022-11-25

Similar Documents

Publication Publication Date Title
US11281965B2 (en) Reconfigurable processing unit
JP5175379B2 (en) Floating point processor with selectable lower precision
US11023206B2 (en) Dot product calculators and methods of operating the same
KR100947138B1 (en) Method, apparatus, system and machine-readable medium for performing rounding operations responsive to an instruction
JP5883462B2 (en) Instructions and logic for range detection
US20140365548A1 (en) Vector matrix product accelerator for microprocessor integration
TW201734764A (en) Memory reduction method for fixed point matrix multiply
JP6907310B2 (en) Dynamically variable precision calculation
JPH10187438A (en) Method for reducing transition to input of multiplier
US11967952B2 (en) Electronic system including FPGA and operation method thereof
WO2024032027A1 (en) Method for reducing power consumption, and processor, electronic device and storage medium
US7519646B2 (en) Reconfigurable SIMD vector processing system
KR20210028075A (en) System to perform unary functions using range-specific coefficient sets
EP1335278A2 (en) Higher precision divide and square root approximations
US8140608B1 (en) Pipelined integer division using floating-point reciprocal
US8972471B2 (en) Arithmetic module, device and system
US10289386B2 (en) Iterative division with reduced latency
US20220156567A1 (en) Neural network processing unit for hybrid and mixed precision computing
US8938485B1 (en) Integer division using floating-point reciprocal
US20230098421A1 (en) Method and apparatus of dynamically controlling approximation of floating-point arithmetic operations
WO2022220835A1 (en) Shared register for vector register file and scalar register file
US20210382687A1 (en) Circuitry for Floating-point Power Function
JP4015411B2 (en) Arithmetic device and information processing apparatus using the arithmetic device
US10564931B1 (en) Floating-point arithmetic operation range exception override circuit
US7580967B2 (en) Processor with maximum and minimum instructions

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23851257

Country of ref document: EP

Kind code of ref document: A1