WO2009017849A1 - Method and system for creating fixed-point software code - Google Patents


Info

Publication number: WO2009017849A1
Authority: WO — WIPO (PCT)
Application number: PCT/US2008/056529
Other languages: French (fr)
Inventors: Kiak Wei Khoo, Kambiz Homayounfar
Original Assignee: Phybit Pte. Ltd.

Classifications

    • H: ELECTRICITY
    • H03: ELECTRONIC CIRCUITRY
    • H03M: CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00: Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/14: Conversion to or from non-weighted codes
    • H03M7/24: Conversion to or from floating-point codes



Abstract

A method and system for a data converter configured to automatically generate fixed-point software code from floating-point code are provided. The data converter includes a signal flow representation of at least one of an algorithm and an application, and a processor configured to receive the signal flow representation. The processor is configured to determine a range of at least one variable along the representation and determine a largest fractional bit width of the at least one variable. The processor is further configured to determine a numerical value lost with the determined fractional bit width and fine-tune the fractional bit width required based on the operations to be performed. A method of converting a first set of program code into a second set of program code is also provided.

Description

METHOD AND SYSTEM FOR CREATING FIXED-POINT SOFTWARE CODE
BACKGROUND OF THE INVENTION
[0001] This invention relates generally to converting floating-point numeric representations to fixed-point numeric representations and, more particularly, to generating fixed-point software code from floating-point software code.
[0002] At least some known embedded systems incorporate high-complexity digital signal processing algorithms and/or applications. These algorithms and/or applications are typically designed or formulated with a high-level programming language such as MATLAB (MathWorks), CoCentric (Synopsys), or SPW (CoWare), and their performance is evaluated using floating-point based simulation. In actual implementation and use, however, such algorithms and/or applications are typically implemented in a fixed-point data type on an embedded processor, ASIC, or FPGA. Such a fixed-point numeric representation is utilized to facilitate, for example, avoiding complex practical implementation issues, such as computation overflow, and to speed up the algorithm development process. Fixed-point numeric representation is also typically utilized to facilitate portability across multiple platforms. In addition, for FPGA and ASIC designs, an optimized fixed-point implementation, e.g., the minimum bit-width required, is sought in order to facilitate reduced size, cost, and power consumption in comparison to less optimal implementations.
[0003] Manually converting the higher level floating-point algorithms and/or applications to a lower level "fixed-point" implementation can be both very time-consuming and error-prone.
[0004] At least some currently known automated, or partially automated, methodologies are based on the bit-true simulation of the fixed-point implementation method 100 shown in Figure 1. In accordance with method 100, global constraints, such as error margins and the fixed-point formats of known variables, are entered 102 into a system. With these constraints, a required fixed-point data format for the other variables is determined 104 based on predetermined propagation rules. A bit-true simulation based on a large number of test vectors is then performed 106. If the result of the simulation does not satisfy 108 the input constraints, for example, the tolerable errors, the whole process repeats. Executing method 100 can be very time-consuming and may not converge if the constraints are not reachable.
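The iterative loop of method 100 can be sketched as follows. This is an illustrative stand-in, not the patent's implementation: the quantization model inside `simulate_bit_true` and the widening strategy are simplified assumptions chosen only to show the structure of steps 106 and 108.

```python
def simulate_bit_true(frac_bits, test_vectors):
    # Toy stand-in for step 106: quantize each test value to frac_bits
    # fractional bits (round-to-nearest) and report the worst error.
    scale = 2 ** frac_bits
    return max(abs(v - round(v * scale) / scale) for v in test_vectors)

def convert_by_bit_true_simulation(test_vectors, max_error, start_bits=1):
    # Sketch of method 100: choose a format, run a bit-true simulation
    # over the test vectors (step 106), check the error constraint
    # (step 108), and repeat with a wider format until it is met. As the
    # text notes, the loop may never terminate if the constraint is not
    # reachable, and each iteration re-simulates every test vector.
    frac_bits = start_bits
    while True:
        if simulate_bit_true(frac_bits, test_vectors) <= max_error:
            return frac_bits
        frac_bits += 1
```

The cost of this approach is visible in the structure: every candidate format triggers a full simulation pass, which is what the analytical method 200 below avoids.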
BRIEF DESCRIPTION OF THE INVENTION
[0005] In one embodiment, a data converter configured to automatically generate fixed-point software code from floating-point code is provided. The data converter includes a processor configured to receive a signal flow representation. The processor is configured to determine a range of at least one variable in the signal flow representation and determine a largest fractional bit width of the at least one variable. The processor is further configured to determine a numerical value lost with the determined fractional bit width and "fine tune" the fractional bit width based on the operations to be performed.
[0006] In another embodiment, a method of converting a first set of program code into a second set of program code is provided. The first set of program code includes one or more floating-point arithmetic operations to be performed on one or more floating-point variables defined therein. The second set of program code includes one or more fixed-point arithmetic operations having been substituted for the one or more floating-point arithmetic operations. The method includes defining a representation of the operations to be performed in the first set of software code, determining a range of at least one variable along the representation, and determining a largest fractional bit width of the at least one variable. The method also includes determining a numerical value lost with the determined fractional bit width and "fine tuning" the fractional bit width required based on the operations to be performed.
[0007] In yet another embodiment, a software code converter includes a range analyzer configured to determine a range of at least one variable of an application embodied in software code to be converted, a precision analyzer configured to determine a largest fractional bit width of the at least one variable with the fractional bit width of the final result being fixed, an error analyzer configured to determine an error function along the range of the at least one variable, and an operations analyzer configured to fine-tune the fractional bit based on the operation performed and the fractional bit requirement in the result.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] Figure 1 is a flow chart of a known bit-true simulation method of fixed-point implementation;
[0009] Figure 2 is a flow chart of an exemplary method of converting floating point software code to a fixed-point implementation based on Affine Arithmetic;
[0010] Figure 3A-B is a source listing of exemplary Mathematica code that is configured to implement a Range Analysis portion of the method shown in Figure 2;
[0011] Figure 4A-B is a source listing of exemplary Mathematica code that is configured to implement a Precision Analysis portion of the method shown in Figure 2;
[0012] Figure 5 is a graph illustrating an exemplary plot of the Uniform Bit Width Affine Error Function;
[0013] Figure 6A-C is a source listing of exemplary Mathematica code that is configured to implement an Error Analysis portion of the method shown in Figure 2; and
[0014] Figure 7 is a schematic block diagram of an exemplary data converter system 700 for automatically converting floating point software code to fixed point software code.

DETAILED DESCRIPTION OF THE INVENTION
[0015] The following detailed description illustrates embodiments of the invention by way of example and not by way of limitation. It is contemplated that the invention has general application to analytical and methodical approaches to software code conversion that accelerate the software development cycle in industrial, commercial, and residential applications.
[0016] As used herein, an element or step recited in the singular and preceded by the word "a" or "an" should be understood as not excluding plural elements or steps, unless such exclusion is explicitly recited. Furthermore, references to "one embodiment" of the present invention are not intended to be interpreted as excluding the existence of additional embodiments that also incorporate the recited features.
[0017] Figure 2 is a flow chart of an exemplary method 200 of converting floating point software code to a fixed-point implementation based on Affine Arithmetic. Figure 3A-B is a source listing 300 of exemplary Mathematica code that is configured to implement a Range Analysis 204 portion of method 200. Figure 4A-B is a source listing 400 of exemplary Mathematica code that is configured to implement a Precision Analysis portion of method 200. Figure 6A-C is a source listing 600 of exemplary Mathematica code that is configured to implement an Error Analysis portion of method 200. Affine Arithmetic is a variant of range arithmetic based on Interval Arithmetic for solving range problems. Affine Arithmetic can be used in areas such as computer graphics, analog circuit sizing, and floating point error modeling. In the exemplary embodiment, Affine Arithmetic is used to find an optimized bit-width for the fixed-point implementation.
[0018] Method 200 includes generating 202 a Signal Flow Diagram of the operations performed in the software code to be converted. Alternatively, the Signal Flow Diagram may be received from a storage device or a network. In the exemplary embodiment, the Signal Flow Diagram is a visual representation of the operations performed in the software code. The range of each variable along the signal flow diagram is then determined using a range analysis 204. The largest fractional bit width of the variables along the signal flow diagram, with the fractional bit width of the final result being fixed, is then determined using a precision analysis 206. The numerical value lost with a particular fractional bit width is then determined using an error analysis 208. An Operation Analysis 210 is then used to further fine-tune the fractional bit width required based on the operations performed.
[0019] Based on Affine Arithmetic, for a signal x over the range [x_min, x_max], the mid-point can be represented as x_0 = (x_max + x_min) / 2 and the maximum variance as x_1 = (x_max − x_min) / 2. The range can then be represented as [x_0 − x_1, x_0 + x_1], which can be expressed as x = x_0 + x_1 ε_1, where ε_1 represents the uncertainty in x and has a value that lies in [−1, 1]. This form of expression is also known as the Affine Form. From this, the Affine Form of the arithmetic operations can also be defined. The following representations give the Affine Form for addition, subtraction, and multiplication:

Addition/Subtraction: x ± y = (x_0 ± y_0) + Σ_{i=1..n} (x_i ± y_i) ε_i

Multiplication: x · y = x_0 y_0 + Σ_{i=1..n} (x_0 y_i + y_0 x_i) ε_i + u v ε_k, where u = Σ_{i=1..n} |x_i|, v = Σ_{i=1..n} |y_i|, and ε_k is a new noise symbol
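The Affine Form operations above can be sketched in Python (the patent's own listings, Figures 3A-B, are Mathematica). The class and method names here are illustrative; coefficients are stored in a dictionary keyed by noise-symbol index so that shared symbols track correlations between signals.

```python
from dataclasses import dataclass, field
from itertools import count

_fresh = count(1)  # generator of unused noise-symbol indices

@dataclass
class Affine:
    # Affine Form: x = x0 + sum_i x_i * eps_i, with eps_i in [-1, 1].
    x0: float
    terms: dict = field(default_factory=dict)  # symbol index -> x_i

    @classmethod
    def from_range(cls, lo, hi):
        # mid-point x0 = (hi + lo)/2, maximum variance x1 = (hi - lo)/2
        return cls((hi + lo) / 2, {next(_fresh): (hi - lo) / 2})

    def range(self):
        rad = sum(abs(c) for c in self.terms.values())
        return (self.x0 - rad, self.x0 + rad)

    def __add__(self, other):
        # (x +- y) = (x0 +- y0) + sum_i (x_i +- y_i) eps_i
        terms = dict(self.terms)
        for i, c in other.terms.items():
            terms[i] = terms.get(i, 0.0) + c
        return Affine(self.x0 + other.x0, terms)

    def __sub__(self, other):
        terms = dict(self.terms)
        for i, c in other.terms.items():
            terms[i] = terms.get(i, 0.0) - c
        return Affine(self.x0 - other.x0, terms)

    def __mul__(self, other):
        # x*y = x0*y0 + sum_i (x0*y_i + y0*x_i) eps_i + u*v*eps_k,
        # where u = sum |x_i|, v = sum |y_i|, eps_k a fresh symbol.
        terms = {}
        for i in set(self.terms) | set(other.terms):
            terms[i] = (self.x0 * other.terms.get(i, 0.0)
                        + other.x0 * self.terms.get(i, 0.0))
        u = sum(abs(c) for c in self.terms.values())
        v = sum(abs(c) for c in other.terms.values())
        if u * v:
            terms[next(_fresh)] = u * v
        return Affine(self.x0 * other.x0, terms)
```

Because symbols are shared, an expression such as `x - x` evaluates to an exact zero range, which is the key advantage of the Affine Form over plain Interval Arithmetic.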
[0020] An exemplary Mathematica program, shown in Figures 3A and 3B, is based on the Affine Form Arithmetic described above to perform Range Analysis 204.
[0021] A fractional bit width to achieve a specified fixed fractional bit of the final result is then performed using precision analysis 206. There are at least two main ways to quantize a signal: truncation and round-to-nearest. In the exemplary embodiment, round-to-nearest is used. The Affϊne Expression and Affϊne Error Form of a signal x can be represented as follows:
x = x + 2~FB*Λε E. = 2-FB'Λ , where
ε = [-1,1], and
FB i = fractional bit-width of x
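Round-to-nearest quantization and its half-ulp error bound, 2^(−FB−1), can be checked with a short sketch; the sample value is arbitrary.

```python
def quantize(x, frac_bits):
    # Round-to-nearest quantization to frac_bits fractional bits.
    scale = 2 ** frac_bits
    return round(x * scale) / scale

# Quantizing 0.3 to FB = 3 fractional bits gives 0.25, an error of
# 0.05, which stays within the half-ulp bound 2^(-3-1) = 0.0625.
assert abs(0.3 - quantize(0.3, 3)) <= 2 ** (-3 - 1)
```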
[0022] The Affine Error Form for addition, subtraction and multiplication is as shown below:
E = E + E + 2~FBzΛε Addition/Subtraction: z x y
Multiplication: E2 = xE y + yEx + ExE y + 2~FB*Λ ε
[0023] For a particular signal flow diagram, after expanding out the Affine Error Form of the arithmetic operations and taking the worst-case errors, the output error needs to be less than about 1 ulp (unit in the last place), resulting in an inequality of the form shown in the example below:

2^(−FB_a − 1) + 2^(−FB_b + 2) + 2^(−FB_a − FB_b − 2) < 2^(−FB_z − 1)
[0024] In the exemplary embodiment, FB_a = FB_b = ··· is considered for the uniform bit width solution. By solving the above inequality under the stated condition of uniform bit width, the resultant fractional bit width is obtained for the given error constraint. A Mathematica program 400, shown in Figures 4A and 4B, is based on the above Affine Error Inequality to perform Precision Analysis 206.
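The uniform-bit-width search can also be sketched numerically rather than symbolically. The error expression below is an illustrative stand-in for the expanded Affine Error Inequality of some specific signal flow diagram, not the patent's own example.

```python
def smallest_uniform_fb(error_of_fb, max_error, fb_limit=64):
    # Smallest uniform fractional bit width FB whose worst-case
    # affine error satisfies the given error constraint.
    for fb in range(fb_limit + 1):
        if error_of_fb(fb) <= max_error:
            return fb
    raise ValueError("constraint not reachable within fb_limit")

# Illustrative stand-in for an expanded Affine Error Inequality:
# three round-to-nearest error terms of 2^(-FB-1) each, bounded by
# the output half-ulp for FB_z = 3 fractional bits, i.e. 2^(-4).
fb = smallest_uniform_fb(lambda f: 3 * 2 ** (-f - 1), 2 ** -4)
```

Because the error expression is monotonically decreasing in FB, a linear scan (or a closed-form solve, as in the Mathematica listing) finds the unique threshold.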
[0025] In the exemplary embodiment, the fractional bit width computed from Precision Analysis 206 is based on a worst-case analysis, for example, a maximum error.
[0026] Figure 5 is a graph 500 illustrating an exemplary plot of the Uniform Bit Width Affine Error Function. Graph 500 includes an x-axis 502 graduated in numbers of bits and a y-axis 504 graduated in error magnitude. As shown in Figure 5, an approximately 8-bit precision based on Precision Analysis 206 is expected. However, if some error margin is allowed, it is possible to find the minimum point of the curve and reduce the number of fractional bits required. By taking the second derivative of the Uniform Bit Width Affine Error Function, the minimum bit width, for example, the 4 bits shown in Figure 5, can be obtained. This is a saving of approximately 50%, for example, from 8 bits to 4 bits. A Mathematica program 600, shown in Figures 6A, 6B, and 6C, performs the above-described Error Analysis 208.
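Locating the curve's minimum can be sketched by scanning a sampled error function. The curve below is hypothetical, chosen only so that a minimum exists below the worst-case bound, mimicking the shape described for Figure 5.

```python
def minimizing_bit_width(error_of_fb, fb_candidates):
    # Bit width at which the sampled error function is smallest.
    return min(fb_candidates, key=error_of_fb)

# Hypothetical sampled curve: worst-case Precision Analysis would
# suggest about 8 bits, but the curve's minimum sits at 4 bits.
curve = {2: 0.30, 3: 0.12, 4: 0.05, 5: 0.08, 6: 0.15, 7: 0.22, 8: 0.25}
best = minimizing_bit_width(curve.get, range(2, 9))
```

A discrete argmin over candidate widths is a pragmatic substitute for the second-derivative analysis the text describes, since the error function is only evaluated at integer bit widths anyway.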
[0027] The fractional bit width may be further fine-tuned using the operation performed and the fractional bit requirement in the result. For example, for an operation "z=x+y", after executing method 200, to have 3 fractional bits of precision in z, 4 fractional bits are needed for x and y. However, if the round-off error is accommodated, only 3 fractional bits are needed for x and y. Similarly, for multiplication, if 3 fractional bits of precision are required in the result, only 2 fractional bits of precision are needed for x and y. For example:
[0028] For fixed-point addition, the format is as follows:

M.N + M.N → M.N

where M is the number of integer bits and N is the number of fractional bits; hence (M + N) is the total number of bits required for the fixed-point number. If three fractional bits are required in the result, i.e., N = 3 for the result, then in order to ensure accuracy, the fractional bit width of the two inputs needs to be at least 4 bits, i.e., N = 4 for the two addends. This is because if the last bits of the two inputs are both 1, a carry into the second-to-last bit will occur in the addition result, while the last bit of the addition is discarded because only three fractional bits are required. In a situation where omitting the carry contributes less than a predetermined amount to the error margin, only three fractional bits may be used for the addition's inputs.
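The carry argument can be verified numerically; the input values below are illustrative (two values whose 4th fractional bits are both 1).

```python
def to_fixed(x, frac_bits):
    # Represent x as an integer count of 2^(-frac_bits) steps
    # (round-to-nearest; Python's round() rounds halves to even).
    return round(x * 2 ** frac_bits)

# Two inputs of 1/16 each: 0.0001 in binary, last fractional bit set.
x, y = 0.0625, 0.0625
a, b = to_fixed(x, 4), to_fixed(y, 4)   # N = 4 inputs: integers 1, 1
s = (a + b) >> 1                        # keep N = 3: drop the last bit
result = s / 2 ** 3                     # carry survives: 0.001b = 1/8
# With only N = 3 for the inputs, the half-step values round away
# (round-half-to-even gives 0) and the contribution is lost entirely.
a3, b3 = to_fixed(x, 3), to_fixed(y, 3)
lost = (a3 + b3) / 2 ** 3
```

Here `result` recovers the exact sum 0.125 because the carry from the two discarded bits lands in a retained bit, whereas quantizing the inputs to 3 fractional bits first loses the sum completely.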
[0029] For fixed-point multiplication, the format is as follows:

M.N × M.N → (M + M).(N + N)

Therefore, if three fractional bits are required for the multiplication result, only two fractional bits are required for each input, because two fractional bits per input produce four fractional bits in the output, which exceeds the three-fractional-bit requirement.
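The bit-growth rule for multiplication can be checked directly; the operand values are illustrative.

```python
# Fixed-point multiply in M.N format: the raw integer product of two
# N-fractional-bit values carries N + N fractional bits. With N = 2,
# the output already has 4 fractional bits, exceeding a 3-bit
# requirement without any rounding of the inputs.
N = 2
a = round(1.25 * 2 ** N)          # 1.25 -> integer 5 with N = 2
b = round(0.75 * 2 ** N)          # 0.75 -> integer 3 with N = 2
raw = a * b                       # 15, carrying N + N = 4 fractional bits
product = raw / 2 ** (2 * N)      # 15/16 = 0.9375, the exact result
```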
[0030] In the exemplary embodiment, the fine-tuning process is a manual process wherein a user reviews the signal flow graph result from the error analysis and identifies each operation that permits further reduction of the bit-width. In an alternative embodiment, the fine-tuning process is performed automatically by a processor (not shown in Figure 6). In one embodiment, the processor adjusts each fractional bit width to a new value and determines whether the bit-width was successfully reduced.
[0031] Figure 7 is a schematic block diagram of an exemplary data converter system 700 for automatically converting floating point software code to fixed point software code. In the exemplary embodiment, system 700 includes a processor 702 configured to execute analysis code 704 such as described above with reference to Figures 3A, 3B, 4A, 4B, 6A, 6B, and 6C. Processor 702 is configured to receive floating point code 706, process floating point code 706 in accordance with analysis code 704, and generate fixed point code 708 capable of executing within predetermined parameters on a predetermined target processor such as an ASIC, an FPGA, and/or an embedded processor such as a DSP.
[0032] The term processor, as used herein, refers to central processing units, microprocessors, microcontrollers, reduced instruction set circuits (RISC), application specific integrated circuits (ASIC), logic circuits, and any other circuit or processor capable of executing the functions described herein.
[0033] As used herein, the terms "software" and "firmware" are interchangeable, and include any computer program stored in memory for execution by processor 702, including RAM memory, ROM memory, EPROM memory, EEPROM memory, and non-volatile RAM (NVRAM) memory. The above memory types are exemplary only, and are thus not limiting as to the types of memory usable for storage of a computer program.
[0034] As will be appreciated based on the foregoing specification, the above-described embodiments of the disclosure may be implemented using computer programming or engineering techniques, including computer software, firmware, hardware, or any combination or subset thereof, wherein the technical effect is a smooth, automatic conversion of floating point software code to fixed-point software code implementable on, for example, an ASIC, an FPGA, and/or an embedded processor such as a DSP, that minimizes the fixed-point bit size and maximizes the accuracy of the data processed by the resultant fixed-point software. Any such resulting program, having computer-readable code means, may be embodied or provided within one or more computer-readable media, thereby making a computer program product, i.e., an article of manufacture, according to the discussed embodiments of the disclosure. The computer readable media may be, for example, but is not limited to, a fixed (hard) drive, diskette, optical disk, magnetic tape, semiconductor memory such as read-only memory (ROM), and/or any transmitting/receiving medium such as the Internet or another communication network or link. The article of manufacture containing the computer code may be made and/or used by executing the code directly from one medium, by copying the code from one medium to another medium, or by transmitting the code over a network.
[0035] The above-described embodiments of a method and system for developing fixed-point software code from floating-point code provide a cost-effective and reliable means for smooth conversion of floating-point software code to fixed-point software code implementable on, for example, an ASIC, an FPGA, and/or an embedded processor such as a DSP. More specifically, the methods and systems described herein facilitate minimizing the fixed-point bit size and maximizing the accuracy of the data processed by the resultant fixed-point software. In addition, the above-described methods and systems provide an automatic analytical method for converting floating-point software code to fixed-point software code. As a result, the methods and systems described herein facilitate automatically developing fixed-point software code from floating-point software code in a cost-effective and reliable manner.
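To make the fixed-point representation concrete, the following Python sketch (illustrative only; the function names and example bit widths are assumptions of this illustration, not the patented procedure) quantizes a floating-point value to a signed fixed-point word with a given integer and fractional bit width using round-to-nearest:

```python
def to_fixed(value, int_bits, frac_bits):
    """Quantize a float to a signed fixed-point integer word with
    int_bits integer bits (sign bit included) and frac_bits fractional
    bits, rounding to nearest and saturating on overflow."""
    scale = 1 << frac_bits              # 2**frac_bits
    raw = round(value * scale)          # round-to-nearest
    lo = -(1 << (int_bits + frac_bits - 1))
    hi = (1 << (int_bits + frac_bits - 1)) - 1
    return max(lo, min(hi, raw))        # saturate to representable range

def from_fixed(raw, frac_bits):
    """Recover the real value represented by a fixed-point word."""
    return raw / (1 << frac_bits)

# Round-to-nearest bounds the quantization error by half an LSB:
# |x - from_fixed(to_fixed(x, ...), frac_bits)| <= 2**-(frac_bits + 1).
x = 0.7071067811865476
raw = to_fixed(x, int_bits=2, frac_bits=10)
assert abs(x - from_fixed(raw, 10)) <= 2 ** -(10 + 1)
```

With int_bits = 2 and frac_bits = 10, for example, 0.7071... is stored as the 12-bit word 724 and recovered as 724/1024 = 0.70703125, within half an LSB of the original.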
[0036] While the disclosure has been described in terms of various specific embodiments, it will be recognized that the disclosure can be practiced with modification within the spirit and scope of the claims.
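Several claims below recite determining a minimum integer bit width using affine arithmetic. As a minimal sketch (the class design and helper names are illustrative assumptions, not the specification's implementation), affine arithmetic represents each quantity as a center value plus a linear combination of noise symbols bounded by [-1, 1]; because correlated quantities share symbols, expressions such as x - x cancel exactly, which can yield tighter ranges, and hence smaller integer bit widths, than plain interval arithmetic:

```python
import math

class Affine:
    """Affine form: center + sum(coef[i] * eps_i), each eps_i in [-1, 1]."""
    _next_sym = 0

    def __init__(self, center, terms=None):
        self.center = center
        self.terms = dict(terms or {})   # noise symbol -> coefficient

    @classmethod
    def from_range(cls, lo, hi):
        """Create an independent quantity known only to lie in [lo, hi]."""
        sym = cls._next_sym
        cls._next_sym += 1
        return cls((lo + hi) / 2, {sym: (hi - lo) / 2})

    def __add__(self, other):
        terms = dict(self.terms)
        for s, c in other.terms.items():
            terms[s] = terms.get(s, 0.0) + c
        return Affine(self.center + other.center, terms)

    def __sub__(self, other):
        terms = dict(self.terms)
        for s, c in other.terms.items():
            terms[s] = terms.get(s, 0.0) - c
        return Affine(self.center - other.center, terms)

    def bounds(self):
        radius = sum(abs(c) for c in self.terms.values())
        return self.center - radius, self.center + radius

def min_integer_bits(af):
    """Smallest signed integer bit width (sign bit included) whose
    range covers the affine form's guaranteed bounds."""
    lo, hi = af.bounds()
    magnitude = max(abs(lo), abs(hi))
    return max(1, math.ceil(math.log2(magnitude + 1)) + 1)

x = Affine.from_range(-1.0, 1.0)
print((x - x).bounds())          # (0.0, 0.0): shared symbols cancel
print(min_integer_bits(x + x))   # 3 signed bits cover [-2.0, 2.0]
```

Interval arithmetic would bound x - x as [-2, 2]; the shared noise symbol is what lets the affine form prove the exact zero range.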

Claims

WHAT IS CLAIMED IS:
1. A data converter configured to generate fixed-point software code from floating-point code, said data converter comprising:
a processor configured to receive a signal flow representation, said processor further configured to:
determine a range of at least one variable along the representation;
determine a largest fractional bit width of the at least one variable;
determine a numerical value lost with the determined fractional bit width; and
fine-tune the fractional bit width required based on the operations to be performed.
2. A data converter in accordance with Claim 1 wherein said processor is further configured to optimize a range of the integer bit width and to optimize a precision of the fractional bit width of the fixed-point software code using affine arithmetic.
3. A data converter in accordance with Claim 1 wherein said processor is further configured to facilitate reducing the fractional bit width size using an error function of the output error as a function of the fractional bit width of the at least one variable.
4. A method of converting a first set of program code into a second set of program code, the first set of program code comprising one or more floating-point arithmetic operations to be performed on one or more floating-point variables defined therein, the second set of program code having one or more fixed-point arithmetic operations substituted for said one or more floating-point arithmetic operations, said method comprising: defining a representation of the operations to be performed in the first set of program code;
determining a range of at least one variable along the representation;
determining a largest fractional bit width of the at least one variable;
determining a numerical value lost with the determined fractional bit width; and
fine-tuning the fractional bit width required based on the operations to be performed.
5. A method in accordance with Claim 4 wherein determining a largest fractional bit width of the at least one variable comprises determining the largest fractional bit width of the at least one variable with the fractional bit width of the final result being fixed.
6. A method in accordance with Claim 4 wherein defining a representation of the operations to be performed comprises generating a signal flow diagram of the operations to be performed.
7. A method in accordance with Claim 4 wherein determining a range of at least one variable comprises determining a minimum integer bit width for the at least one variable using affine arithmetic.
8. A method in accordance with Claim 4 wherein determining a range of at least one variable comprises performing range analysis of each of the at least one variable.
9. A method in accordance with Claim 4 wherein determining a largest fractional bit width of the at least one variable comprises quantizing the range of the signal using a round-to-nearest technique.
10. A method in accordance with Claim 4 wherein determining a largest fractional bit width of the at least one variable comprises quantizing the range of the signal using:
x̂ = x + 2^(−FB_x̂) · ε, where
FB_x̂ = fractional bit-width of x̂, and ε ∈ [−1, 1].
11. A method in accordance with Claim 4 wherein determining a largest fractional bit width of the at least one variable comprises determining an error function along the range of the at least one variable using:
E_x̂ = 2^(−FB_x̂), where
FB_x̂ = fractional bit-width of x̂.
12. A method in accordance with Claim 11 further comprising determining a uniform bit width affine error function from the determined error function using worst case error assumptions.
13. A method in accordance with Claim 12 further comprising determining a minimum bit width of the fractional bit width using a second derivative of the uniform bit width affine error function.
14. A software code converter comprising: a range analyzer configured to determine a range of at least one variable of an application embodied in software code to be converted; a precision analyzer configured to determine a largest fractional bit width of the at least one variable with the fractional bit width of the final result being fixed; an error analyzer configured to determine an error function along the range of the at least one variable; and an operations analyzer configured to fine-tune the fractional bit width based on the operation performed and the fractional bit requirement in the result.
15. A software code converter in accordance with Claim 14 wherein said software code converter is configured to receive a representation of the operations to be performed by the software to be converted.
16. A software code converter in accordance with Claim 14 wherein said software code converter is configured to generate a signal flow diagram of the operations to be performed by the software to be converted.
17. A software code converter in accordance with Claim 14 wherein said range analyzer is configured to determine a minimum integer bit width for the at least one variable using affine arithmetic.
18. A software code converter in accordance with Claim 14 wherein said precision analyzer is configured to quantize the range of the signal using a round-to-nearest technique.
19. A software code converter in accordance with Claim 14 wherein said precision analyzer is configured to quantize the range of the signal using:
x̂ = x + 2^(−FB_x̂) · ε, where
FB_x̂ = fractional bit-width of x̂, and ε ∈ [−1, 1].
20. A software code converter in accordance with Claim 14 wherein said error analyzer is configured to determine an error function along the range of the at least one variable using:
E_x̂ = 2^(−FB_x̂), where
FB_x̂ = fractional bit-width of x̂.
21. A software code converter in accordance with Claim 20 wherein said error analyzer is configured to determine a uniform bit width affine error function from the determined error function using worst case error assumptions.
22. A software code converter in accordance with Claim 21 wherein said error analyzer is configured to determine a minimum bit width of the fractional bit width using a second derivative of the uniform bit width affine error function.
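As a numerical companion to the error-analysis claims above, the sketch below uses a deliberately simplified model (an assumed uniform fractional bit width and worst-case additive error, rather than the claimed second-derivative fine-tuning; the function name is illustrative) to find the smallest uniform fractional bit width whose accumulated quantization error stays within a given output budget:

```python
import math

def min_fractional_bits(num_error_terms, error_budget):
    """Smallest uniform fractional bit width FB such that the
    worst-case accumulated error, num_error_terms * 2**-FB,
    does not exceed error_budget."""
    # num_error_terms * 2**-FB <= error_budget
    #   <=>  FB >= log2(num_error_terms / error_budget)
    return max(0, math.ceil(math.log2(num_error_terms / error_budget)))

# Example: 8 quantized signals summed, output error budget of 2**-10.
fb = min_fractional_bits(8, 2 ** -10)
assert 8 * 2 ** -fb <= 2 ** -10
print(fb)  # 13
```

Each extra fractional bit halves the worst-case error, which is why the required width grows only logarithmically with the number of error terms and the tightness of the budget.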

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2008/056529 WO2009017849A1 (en) 2008-03-11 2008-03-11 Method and system for creating fixed-point software code

Publications (1)

Publication Number Publication Date
WO2009017849A1 true WO2009017849A1 (en) 2009-02-05

Family

ID=40304700

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103336432A (en) * 2013-07-19 2013-10-02 蒲亦非 Fractional order self-adaptation signal processor based on fractional order steepest descent method
CN107038016A (en) * 2017-03-29 2017-08-11 广州酷狗计算机科技有限公司 A kind of floating number conversion method and device based on GPU
CN107045494A (en) * 2017-05-08 2017-08-15 科大讯飞股份有限公司 Improve the method and system of floating-point matrix operation efficiency

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5359551A (en) * 1989-06-14 1994-10-25 Log Point Technologies, Inc. High speed logarithmic function generating apparatus
US6691301B2 (en) * 2001-01-29 2004-02-10 Celoxica Ltd. System, method and article of manufacture for signal constructs in a programming language capable of programming hardware architectures
US20050065990A1 (en) * 2003-09-17 2005-03-24 Catalytic, Inc. Emulation of a fixed point operation using a corresponding floating point operation
US20050116955A1 (en) * 2003-11-25 2005-06-02 Canon Kabushiki Kaisha Pixel accurate edges for scanline rendering system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08743776

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC

122 Ep: pct application non-entry in european phase

Ref document number: 08743776

Country of ref document: EP

Kind code of ref document: A1