CN113312024A - Option pricing calculation hardware accelerator, accelerator card and computer equipment - Google Patents

Info

Publication number
CN113312024A
Authority
CN
China
Prior art keywords
option
multiplier
output
monte carlo
hardware accelerator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110674306.5A
Other languages
Chinese (zh)
Other versions
CN113312024B (en)
Inventor
黎渊
戴艺
陆平静
欧洋
常俊胜
孙岩
张建民
徐金波
罗章
王子聪
熊泽宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202110674306.5A
Publication of CN113312024A
Application granted
Publication of CN113312024B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57: Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
    • G06F7/575: Basic arithmetic logic units, i.e. devices selectable to perform either addition, subtraction or one of several logical operations, using, at least partially, the same circuitry
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50: Adding; Subtracting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52: Multiplying; Dividing
    • G06F7/523: Multiplying only
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/58: Random or pseudo-random number generators
    • G06F7/588: Random number generators, i.e. based on natural stochastic processes

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an option pricing calculation hardware accelerator, an accelerator card and computer equipment. The option pricing calculation hardware accelerator comprises a Gaussian random number generator, a multiplier m2, an adder a2, an EXP module, a multiplexer MUXA, a multiplexer MUXB, a multiplier m5, a subtractor s0, a comparator, a delay module, an accumulator, a shifter and a multiplication array. Through the combination of these components, one Monte Carlo iteration path can be simulated in M clock cycles, so that one option execution price SM can be predicted. To reduce hardware resource consumption and circuit complexity, 64-bit floating-point operations are converted to fixed-point operations. The accelerator card is a card comprising the hardware accelerator, and the computer equipment is provided with the option pricing calculation hardware accelerator. The invention allows the hardware-accelerated Monte Carlo option pricing calculation to run without stalls, with the whole calculation process fully pipelined, and achieves better performance and energy efficiency than CPU and GPU implementations on the same process.

Description

Option pricing calculation hardware accelerator, accelerator card and computer equipment
Technical Field
The invention relates to a hardware acceleration technology of Monte Carlo option pricing, in particular to an option pricing calculation hardware accelerator, an accelerator card and computer equipment.
Background
Monte Carlo option pricing is an existing software algorithm. As shown in FIG. 1, its calculation process mainly includes two loops: the inner loop (lines 8 to 11) simulates one random prediction path of the option price; the outer loop (lines 6 to 15) calculates and accumulates the payoffs from all paths, and the sum of the payoffs is then averaged and discounted (lines 16 to 17) to obtain the predicted option price. Implementing a hardware accelerator on an FPGA for this software algorithm model is expected to improve the computational efficiency of Monte Carlo option pricing. However, existing FPGA implementations mostly convert the software algorithm directly into a hardware accelerator; a hardware accelerator generated in this way still leaves considerable room for optimization and suffers from insufficient energy efficiency and performance.
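For orientation, the two-loop software algorithm described above can be sketched as follows. FIG. 1 itself is not reproduced in this text, so this is a minimal sketch rather than the figure's pseudocode: the function name `mc_option_price`, the argument names, the call-style payoff and the use of Python's built-in generator are assumptions made for illustration.

```python
import math
import random

def mc_option_price(S0, strike, r, sigma, T, M, I, seed=0):
    """Software Monte Carlo estimate of a call option price (illustrative)."""
    rng = random.Random(seed)            # stand-in random source (assumption)
    dt = T / M
    drift = (r - 0.5 * sigma * sigma) * dt
    sigqrdt = sigma * math.sqrt(dt)
    sum_payoff = 0.0
    for _ in range(I):                   # outer loop: one path per iteration
        S = S0
        for _ in range(M):               # inner loop: M time slices of a path
            z = rng.gauss(0.0, 1.0)
            S *= math.exp(drift + sigqrdt * z)
        sum_payoff += max(S - strike, 0.0)   # payoff of this path
    # average over all paths, then discount
    return math.exp(-r * T) * sum_payoff / I
```

With zero volatility every path is identical, which gives a simple sanity check: the result equals S0 - strike * exp(-r * T) whenever that value is non-negative.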
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the above problems in the prior art, the invention provides an option pricing calculation hardware accelerator, an accelerator card and computer equipment in which the hardware-accelerated Monte Carlo option pricing calculation runs without stalls and the whole calculation process is fully pipelined, and which have better energy efficiency and performance than CPU and GPU implementations on the same process.
In order to solve the technical problems, the invention adopts the technical scheme that:
an option pricing calculation hardware accelerator comprising a first circuit unit for completing one Monte Carlo simulation of the option execution price SM in M time slices, the first circuit unit comprising:
a Gaussian random number generator for generating a Gaussian random number z;
a multiplier m2 for multiplying an input parameter sigqrdt by the Gaussian random number z, where the parameter sigqrdt is calculated as sigma * sqrt(T/M), sigma is a preset option price volatility, T is a preset option validity period, and M is a preset number of time slices simulated in each Monte Carlo iteration;
an adder a2 for summing an input parameter drift and the output of the multiplier m2, where the parameter drift is calculated as (r - 0.5 * sigma * sigma) * (T/M), and r is a preset risk-free interest rate;
an EXP module for performing an exponential operation on the output of the adder a2;
a multiplexer MUXA for selecting among the option initial price S0, the constant 1 and the multiplication result acc_sm output by the multiplier m5, wherein the multiplexer MUXA selects the option initial price S0 at the (dg + dm2 + da2 + de)th clock cycle after reset, waiting for the first Gaussian random number z0 to enter the multiplier m5; the multiplexer MUXA then selects the option initial price S0 in the next cycle, selects the constant 1 in the following 3 cycles, and selects the multiplication result acc_sm output by the multiplier m5 in the following M-4 cycles; these three selection operations are repeated I times to simulate all Monte Carlo simulation paths, where I is the preset number of Monte Carlo iterations, dg denotes the number of delay clock cycles of the Gaussian random number generator, dm2 denotes the number of delay clock cycles of the multiplier m2, da2 denotes the number of delay clock cycles of the adder a2, and de denotes the number of delay clock cycles of the EXP module;
a multiplier m5 for multiplying the output of the EXP module by the output of the multiplexer MUXA to obtain the multiplication result acc_sm and outputting it to the multiplication array;
and a multiplication array for multiplying together the 4 intermediate products that remain distributed in the four-stage pipeline of the multiplier m5 after the M time-slice simulations of one random path are completed, and outputting the final option price SM predicted by this Monte Carlo iteration after 3 + 2dma beats.
Optionally, the multiplication array comprises:
a multiplier ma0 for multiplying the output of the multiplier m5 by that output delayed through a one-stage register;
a multiplier ma1 for multiplying the output of the multiplier m5 delayed through a two-stage register by that output delayed through a three-stage register;
and a multiplier ma2 for multiplying the outputs of the multipliers ma0 and ma1 to obtain the final option price SM predicted by this Monte Carlo iteration.
Optionally, the multiplication array further includes an output terminal for a valid signal. The valid signal of the multiplication array is first asserted at the (dg + dm2 + da2 + de + M + 3 + 2dma)th clock cycle after reset and is thereafter asserted for one beat every M time slices, where dg denotes the number of delay clock cycles of the Gaussian random number generator, dm2 denotes the number of delay clock cycles of the multiplier m2, da2 denotes the number of delay clock cycles of the adder a2, de denotes the number of delay clock cycles of the EXP module, and M denotes the number of time slices simulated in each Monte Carlo iteration; the multipliers ma0, ma1 and ma2 each have a delay of dma clock cycles, and 2dma denotes twice dma.
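The timing rule for the valid signal can be illustrated with a small helper. This is a sketch only: the function name `valid_cycles` is hypothetical, and the latency values used in the usage note are made-up numbers, not figures from the patent.

```python
def valid_cycles(dg, dm2, da2, de, dma, M, I):
    """Clock cycles (counted from reset) at which valid is asserted.

    The first assertion follows the expression in the text; after that
    valid is asserted for one beat every M cycles, once per path.
    """
    first = dg + dm2 + da2 + de + M + 3 + 2 * dma
    return [first + k * M for k in range(I)]
```

For example, with assumed latencies dg=10, dm2=2, da2=1, de=6, dma=2 and M=8, the first valid beat falls on cycle 34 and later ones follow every 8 cycles.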
Optionally, the gaussian random number generator comprises:
a uniformly distributed random number generation module for generating uniformly distributed random numbers (URNs) using the WELL19937 method;
and a Box-Muller module for converting the uniformly distributed random numbers (URNs) into Gaussian random numbers z using the Box-Muller method.
Optionally, the pipeline output end of the first circuit unit is further connected to a second circuit unit for processing the outer-loop data path of the Monte Carlo option pricing calculation, and the second circuit unit includes:
a subtraction unit s0 for subtracting the strike price strike of the option from the option price SM predicted by this Monte Carlo iteration;
a comparator for comparing the output of the subtraction unit s0 with 0 and outputting a control signal;
a multiplexer MUXB for selecting 0 or the output of the subtraction unit s0 as its output according to the output of the comparator, where, if the output of the subtraction unit s0 is greater than or equal to 0, the comparator outputs 1 and the multiplexer MUXB outputs the output of the subtraction unit s0, and otherwise the multiplexer MUXB outputs 0;
a delay unit delay for generating an enable signal en from the valid signal of the multiplication array;
and an accumulator for accumulating the output of the multiplexer MUXB under the control of the enable signal en to obtain the accumulated value sum_payoff of the option payoffs predicted by all Monte Carlo simulation paths.
Optionally, the pipeline output end of the second circuit unit is further connected to a third circuit unit for generating the final estimated option price from the accumulated value sum_payoff of the option payoffs predicted by all Monte Carlo simulation paths.
Optionally, the third circuit unit includes:
a shift unit for shifting the accumulated value sum_payoff of the option payoffs predicted by all Monte Carlo simulation paths so as to implement division;
and a multiplier m7 for multiplying the output of the shift unit by the discount rate ert to obtain the final payoff, where the discount rate ert is calculated as ert = exp(-r * T), r is a preset risk-free interest rate, and T is a preset option validity period.
In addition, the invention also provides a hardware accelerator card, which comprises an accelerator card body and an accelerator chip arranged on the accelerator card body, wherein the accelerator chip is the option pricing calculation hardware accelerator.
In addition, the invention also provides computer equipment, which comprises a mainboard provided with a microprocessor and a memory connected with each other, and is characterized by also comprising the option pricing calculation hardware accelerator, wherein the microprocessor and the option pricing calculation hardware accelerator are in communication connection.
Optionally, the option pricing calculation hardware accelerator is integrated on the mainboard, or is plugged into the mainboard in the form of an add-in card.
Compared with the prior art, the invention has the following advantages:
1. The invention comprises a Gaussian random number generator, a multiplier m2, an adder a2, an EXP module, a multiplexer MUXA, a multiplexer MUXB, a multiplier m5, a subtractor s0, a comparator, a delay module, an accumulator, a shifter and a multiplication array. Through the combination of these components, the Monte Carlo simulation of one option execution price SM can be completed in M time slices. The hardware-accelerated Monte Carlo option pricing calculation runs without stalls, the whole calculation process is fully pipelined, and the energy efficiency and performance are better than those of CPU and GPU implementations on the same process.
2. The multiplication array multiplies together the 4 intermediate products that remain distributed in the four-stage pipeline of the multiplier m5 after the M time-slice simulations of a random path are completed. By introducing the multiplication array, these four intermediate results enter the array in sequence, and the final product, namely the option price SM predicted by this Monte Carlo iteration, is obtained at the output of the array after 3 + 2dma beats, so that the simulation process never stalls and the whole calculation process is fully pipelined.
3. The invention converts 64-bit floating-point operations into fixed-point operations, which effectively improves the efficiency of the accelerated calculation.
Drawings
Fig. 1 is a pseudo code diagram of a software implementation of a conventional monte carlo option pricing algorithm.
Fig. 2 is a schematic circuit diagram of an option pricing calculation hardware accelerator according to an embodiment of the present invention.
Fig. 3 is a diagram comparing the performance and energy efficiency of the option pricing calculation hardware accelerator according to an embodiment of the present invention.
Detailed Description
As shown in fig. 2, the option pricing calculation hardware accelerator of the embodiment includes a first circuit unit for completing one Monte Carlo simulation of the option execution price SM in M time slices, and the first circuit unit includes:
a Gaussian random number generator for generating a Gaussian random number z;
a multiplier m2 for multiplying an input parameter sigqrdt by the Gaussian random number z, where the parameter sigqrdt is calculated as sigma * sqrt(T/M), sigma is a preset option price volatility, T is a preset option validity period, and M is a preset number of time slices simulated in each Monte Carlo iteration;
an adder a2 for summing an input parameter drift and the output of the multiplier m2, where the parameter drift is calculated as (r - 0.5 * sigma * sigma) * (T/M), and r is a preset risk-free interest rate;
an EXP module for performing an exponential operation on the output of the adder a2;
a multiplexer MUXA for selecting among the option initial price S0, the constant 1 and the multiplication result acc_sm output by the multiplier m5, wherein the multiplexer MUXA selects the option initial price S0 at the (dg + dm2 + da2 + de)th clock cycle after reset, waiting for the first Gaussian random number z0 to enter the multiplier m5; the multiplexer MUXA then selects the option initial price S0 in the next cycle, selects the constant 1 in the following 3 cycles, and selects the multiplication result acc_sm output by the multiplier m5 in the following M-4 cycles; these three selection operations are repeated I times to simulate all Monte Carlo simulation paths, where I is the preset number of Monte Carlo iterations, dg denotes the number of delay clock cycles of the Gaussian random number generator, dm2 denotes the number of delay clock cycles of the multiplier m2, da2 denotes the number of delay clock cycles of the adder a2, and de denotes the number of delay clock cycles of the EXP module;
a multiplier m5 for multiplying the output of the EXP module by the output of the multiplexer MUXA to obtain the multiplication result acc_sm and outputting it to the multiplication array;
and a multiplication array for multiplying together the 4 intermediate products that remain distributed in the four-stage pipeline of the multiplier m5 after the M time-slice simulations of one random path are completed, and outputting the final option price SM predicted by this Monte Carlo iteration after 3 + 2dma beats.
It should be noted that the bold characters or numbers in fig. 2 indicate the number of delayed clock cycles of the corresponding components, and (x, y) indicate the bit width value of the corresponding operand, where x indicates the bit width of the integer part and y indicates the bit width of the fractional part.
As shown in fig. 2, the multiplication array includes:
a multiplier ma0 for multiplying the output of the multiplier m5 by that output delayed through a one-stage register;
a multiplier ma1 for multiplying the output of the multiplier m5 delayed through a two-stage register by that output delayed through a three-stage register;
and a multiplier ma2 for multiplying the outputs of the multipliers ma0 and ma1 to obtain the final option price SM predicted by this Monte Carlo iteration.
As shown in fig. 2, the multiplication array further includes an output terminal for a valid signal. The valid signal of the multiplication array is first asserted at the (dg + dm2 + da2 + de + M + 3 + 2dma)th clock cycle after reset and is thereafter asserted for one beat every M time slices, where dg denotes the number of delay clock cycles of the Gaussian random number generator, dm2 denotes the number of delay clock cycles of the multiplier m2, da2 denotes the number of delay clock cycles of the adder a2, de denotes the number of delay clock cycles of the EXP module, and M denotes the number of time slices simulated in each Monte Carlo iteration; the multipliers ma0, ma1 and ma2 each have a delay of dma clock cycles, and 2dma denotes twice dma.
As shown in fig. 2, the gaussian random number generator includes:
a uniformly distributed random number generation module for generating uniformly distributed random numbers (URNs) using the WELL19937 method;
and a Box-Muller module for converting the uniformly distributed random numbers (URNs) into Gaussian random numbers z using the Box-Muller method.
The Gaussian random number generator in this embodiment employs the WELL19937 algorithm as the uniformly distributed random number generator. WELL19937 has proven to be the most advanced of the current uniformly distributed random number generators: it has been shown to have perfect distribution characteristics, generates the highest-quality random numbers, and has an extremely long period of 2^19937. Introducing this algorithm ensures the correctness of the simulation result. The Box-Muller (BM) method is selected in this embodiment because it generates completely accurate Gaussian samples. Furthermore, unlike rejection methods that include if-else conditions in the datapath (e.g., Ziggurat and Monty Python), the BM method has a fixed datapath, which ensures that a Gaussian random number (GRN) is available every clock cycle. Together these ensure the quality of the converted GRNs and the correctness of the final system.
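The fixed, branch-free datapath of the Box-Muller transform can be seen in a short software sketch. Python's standard library does not provide WELL19937, so `random.Random` (a Mersenne Twister) is used here purely as a stand-in uniform source; the function names are hypothetical.

```python
import math
import random

def box_muller(u1, u2):
    """Map two uniforms (u1 in (0, 1], u2 in [0, 1)) to two Gaussians."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)

def gaussian_stream(n, seed=0):
    """Return n standard normal samples from a stand-in uniform source."""
    rng = random.Random(seed)      # NOT WELL19937: substitute for illustration
    out = []
    while len(out) < n:
        u1 = 1.0 - rng.random()    # shift [0, 1) to (0, 1] so log(u1) is finite
        u2 = rng.random()
        out.extend(box_muller(u1, u2))
    return out[:n]
```

Every pair of uniforms yields a pair of Gaussians with no rejection branch, which is the property the text relies on for producing one sample per clock cycle.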
Referring to fig. 2, the execution of the first circuit unit in this embodiment comprises the first and second stages of the fully pipelined hardware structure: the first stage is the execution stage of the Gaussian random number generator, and the second stage is the execution stage of the rest of the first circuit unit. In the first stage, a Gaussian random number generator (GRNG) generates Gaussian random numbers (GRNs) to simulate the Wiener process. First, uniform random numbers (URNs) are generated using the WELL19937 method; they are then converted into Gaussian random numbers z by the Box-Muller (BM) method. In the second stage, the Monte Carlo simulation of the option execution price SM is completed in M time slices, which accelerates the inner loop of the software algorithm (lines 8 to 11). To reduce complexity, the price fluctuation formula (line 10) is first expanded into a two-input static single-assignment intermediate representation containing only basic operations. These operations are mapped onto two multipliers, one adder and one EXP module. The parameter shown next to each module is the number of clock cycles the module needs to complete its calculation. The multiplexer MUXA controls the computation flow: it selects the option initial price S0 at the (dg + dm2 + da2 + de)th clock cycle after system start-up, waiting for the first Gaussian variable z0 to enter the multiplier m5. The multiplexer MUXA then selects the option initial price S0 in the next cycle, the constant 1 in the following 3 cycles, and the multiplication result acc_sm output by the multiplier m5 in the following M-4 cycles. These three selection operations are repeated I times to simulate all Monte Carlo iteration paths. At the M-th clock cycle of each iteration, the accumulated SM is distributed across the four-stage pipeline of the multiplier m5 as four intermediate products.
These four intermediate products are then forwarded in turn to the multiplication array to obtain the option price SM predicted by this Monte Carlo iteration. The entire logic of the second stage is fully pipelined, and once the pipeline is full, one simulated path of the option price is calculated every M time slices.
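The interplay of MUXA, the four-stage multiplier m5 and the multiplication array can be modelled at cycle level. The sketch below is an illustrative behavioural model, not the patent's circuit: it assumes a pipeline depth of exactly 4, ignores the latencies of the other modules, and uses floating point instead of fixed point; `simulate_path` and `PIPE_DEPTH` are hypothetical names.

```python
import math
from collections import deque

PIPE_DEPTH = 4  # assumed number of pipeline stages in multiplier m5

def simulate_path(S0, exps):
    """Model one path; exps holds the M per-slice factors from the EXP module."""
    assert len(exps) >= PIPE_DEPTH
    pipe = deque([1.0] * PIPE_DEPTH)   # products currently in flight inside m5
    for t, e in enumerate(exps):
        acc_sm = pipe.popleft()        # product issued PIPE_DEPTH cycles ago
        if t == 0:
            operand = S0               # MUXA selects S0 for the first factor
        elif t < PIPE_DEPTH:
            operand = 1.0              # MUXA selects the constant 1 (3 cycles)
        else:
            operand = acc_sm           # MUXA selects the fed-back product
        pipe.append(e * operand)       # m5 issues a new multiplication
    p0, p1, p2, p3 = pipe              # four interleaved partial products
    ma0 = p0 * p1                      # multiplication array, first level
    ma1 = p2 * p3
    return ma0 * ma1                   # ma2: simulated price SM
```

Because the feedback takes four cycles, the path's factors are split into four interleaved partial products, and the array's three multiplications recombine them into S0 times the product of all M factors without stalling the pipeline.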
As shown in fig. 2, in this embodiment, the pipeline output end of the first circuit unit is further connected to a second circuit unit for processing the outer-loop data path of the Monte Carlo option pricing calculation, and the second circuit unit includes:
a subtraction unit s0 for subtracting the strike price strike of the option from the option price SM predicted by this Monte Carlo iteration;
a comparator for comparing the output of the subtraction unit s0 with 0 and outputting a control signal;
a multiplexer MUXB for selecting 0 or the output of the subtraction unit s0 as its output according to the output of the comparator, where, if the output of the subtraction unit s0 is greater than or equal to 0, the comparator outputs 1 and the multiplexer MUXB outputs the output of the subtraction unit s0, and otherwise the multiplexer MUXB outputs 0;
a delay unit delay for generating an enable signal en from the valid signal of the multiplication array;
and an accumulator for accumulating the output of the multiplexer MUXB under the control of the enable signal en to obtain the accumulated value sum_payoff of the option payoffs predicted by all Monte Carlo simulation paths.
Referring to fig. 2, the execution stage of the second circuit unit in this embodiment is the third stage of the fully pipelined hardware structure.
The third stage processes the data path of the outer loop (lines 6 to 15), in which payoffs are calculated and accumulated from the option prices SM predicted by each Monte Carlo iteration in the second stage. The subtraction unit s0 calculates the option price SM predicted by the current Monte Carlo iteration minus strike_price. If the option price SM is greater than or equal to strike_price, the result is forwarded to the accumulator through the multiplexer MUXB. The enable signal en is connected to the enable input of the accumulator, and the delay unit is added to ensure that the valid signal of the multiplication array reaches the accumulator at the same time as the output of the multiplexer MUXB.
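A behavioural sketch of this stage is given below. It is a simplified software model under stated assumptions: one list element stands for one clock beat, the delay unit is modelled as already aligned (en equals valid for the same element), and the function name `accumulate_payoffs` is hypothetical.

```python
def accumulate_payoffs(sm_stream, valid_stream, strike):
    """Accumulate max(SM - strike, 0) only on beats where valid is set."""
    sum_payoff = 0.0
    for sm, valid in zip(sm_stream, valid_stream):
        diff = sm - strike               # subtraction unit s0
        sel = 1 if diff >= 0 else 0      # comparator output
        muxb = diff if sel else 0.0      # multiplexer MUXB
        en = valid                       # delay unit: valid aligned with muxb
        if en:
            sum_payoff += muxb           # accumulator updates only when enabled
    return sum_payoff
```

Beats on which valid is low carry intermediate values from the multiplication array, so gating the accumulator with en keeps them out of sum_payoff.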
As shown in fig. 2, the pipeline output end of the second circuit unit in this embodiment is further connected to a third circuit unit for generating the final estimated option price from the accumulated value sum_payoff of the option payoffs predicted by all Monte Carlo simulation paths.
As shown in fig. 2, the third circuit unit in the present embodiment includes:
a shift unit for shifting the accumulated value sum_payoff of the option payoffs predicted by all Monte Carlo simulation paths so as to implement division;
and a multiplier m7 for multiplying the output of the shift unit by the discount rate ert to obtain the final payoff, where the discount rate ert is calculated as ert = exp(-r * T), r is a preset risk-free interest rate, and T is a preset option validity period.
Referring to fig. 2, the execution stage of the third circuit unit in this embodiment is the fourth stage of the fully pipelined hardware structure. The fourth stage generates the final predicted option price. It consists of two operations, one division and one multiplication, which perform averaging and discounting respectively. To reduce complexity, the number of iterations I is set to a power of 2, so that a simple shift unit can implement the division.
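The shift-based averaging followed by discounting can be sketched as follows. The representation is an assumption for illustration: sum_payoff is held as an integer with `frac_bits` fractional bits, I equals 2**log2_I, the shift truncates, and the final multiply is done in floating point rather than in the fixed-point multiplier m7. The function name `final_price` is hypothetical.

```python
import math

def final_price(sum_payoff_raw, frac_bits, log2_I, r, T):
    """Average by right shift (I = 2**log2_I), then apply ert = exp(-r*T)."""
    mean_raw = sum_payoff_raw >> log2_I       # shift unit: divide by I
    mean = mean_raw / (1 << frac_bits)        # back to a real value
    ert = math.exp(-r * T)                    # discount rate
    return mean * ert                         # multiplier m7
```

For instance, a raw sum of 16384 with 4 fractional bits represents 1024.0; shifting by 10 bits divides by I = 1024 and leaves a mean of 1.0 before discounting.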
In this embodiment, 64-bit floating-point operations are converted into fixed-point operations, which improves the running performance of the hardware accelerator, reduces the hardware resources it consumes and reduces design complexity. In addition, as an optional implementation, the operand bit widths in the hardware accelerator structure are optimized: a bit-width search over the whole system is performed with a simulated annealing algorithm, and the resulting operand bit widths (x, y) of each component are as follows:
the Gaussian random number z input to the multiplier m2 has bit width (5,19), and the parameter sigqrdt has bit width (0,30);
the parameter drift input to the adder a2 has bit width (5,24), and the output of the multiplier m2 has bit width (5,24);
the output of the adder a2 input to the EXP module has bit width (5,24);
the output of the EXP module input to the multiplier m5 has bit width (2,14), and the output of the multiplexer MUXA has bit width (16,16);
the operands input to the multipliers ma0, ma1 and ma2 have bit width (12,12);
the operands input to the subtraction unit s0 have bit width (16,4), and its output has bit width (16,4);
the output of the accumulator has bit width (40,4);
the output of the shift unit has bit width (17,7);
the discount rate ert input to the multiplier m7 has bit width (2,6), and its output has bit width (17,13).
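To make the (x, y) notation concrete, the helper below quantizes a real value to x integer bits and y fractional bits. The patent does not state the sign handling or rounding mode, so a separate sign, round-to-nearest and saturation on overflow are assumptions of this sketch, and `to_fixed` is a hypothetical name.

```python
def to_fixed(value, int_bits, frac_bits):
    """Quantize value to an (int_bits, frac_bits) fixed-point grid.

    Assumptions: signed range, round to nearest, saturate on overflow.
    Returns the representable real value closest to the input.
    """
    scale = 1 << frac_bits
    raw = int(round(value * scale))
    hi = (1 << (int_bits + frac_bits)) - 1    # largest representable magnitude
    raw = max(-hi - 1, min(hi, raw))          # saturate instead of wrapping
    return raw / scale
```

Under these assumptions a (5,19) operand such as z resolves steps of 2^-19 and saturates just below 32, while a (0,30) operand such as sigqrdt covers only values below 1 but with 30 fractional bits.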
The bit-width-optimized structure reduces hardware area overhead and hardware complexity while preserving calculation accuracy. A traditional greedy algorithm accepts only a better solution as the next search state during the whole-system bit-width search, whereas the simulated annealing algorithm accepts a worse solution with a certain probability, so that it can escape a local optimum and is more likely to reach the global optimum.
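The difference from a greedy search lies in the acceptance rule, which the toy sketch below illustrates. Everything here is an assumption made for illustration: the cost function (total bits plus a fixed penalty when the error exceeds a tolerance), the linear cooling schedule, the single-bit neighbour move and the name `anneal_bitwidths`; the patent does not disclose its actual search parameters.

```python
import math
import random

def anneal_bitwidths(error_of, n_operands, max_bits=32, tol=1e-3,
                     steps=2000, t0=5.0, seed=0):
    """Toy simulated-annealing search for small bit widths within tolerance."""
    rng = random.Random(seed)

    def cost(widths):
        penalty = 1000.0 if error_of(widths) > tol else 0.0
        return sum(widths) + penalty

    current = [max_bits] * n_operands          # start from the widest setting
    cur_cost = cost(current)
    best, best_cost = list(current), cur_cost
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9    # linear cooling
        cand = list(current)
        i = rng.randrange(n_operands)
        cand[i] = min(max_bits, max(1, cand[i] + rng.choice((-1, 1))))
        c = cost(cand)
        # A greedy search would require c <= cur_cost; annealing also accepts
        # a worse candidate with probability exp((cur_cost - c) / temp).
        if c <= cur_cost or rng.random() < math.exp((cur_cost - c) / temp):
            current, cur_cost = cand, c
            if c < best_cost:
                best, best_cost = list(cand), c
    return best, best_cost
```

With a toy error model such as `lambda w: sum(2.0 ** -b for b in w)`, the search trades bits between operands while keeping the total error under the tolerance; in a real flow the error would come from simulating the quantized datapath.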
It should be noted that the parameters involved in fig. 1 may be calculated in advance and written into the corresponding registers, or may be calculated on the fly after the raw parameters are input, as needed.
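The pre-calculation option can be sketched as a host-side routine that derives the three constants and scales them to integer register values. The fractional widths follow the bit widths listed for sigqrdt (0,30), drift (5,24) and ert (2,6); the register layout, round-to-nearest scaling and the name `precompute_registers` are assumptions.

```python
import math

def precompute_registers(r, sigma, T, M):
    """Return integer register values for sigqrdt, drift and ert."""
    dt = T / M
    params = {
        "sigqrdt": (sigma * math.sqrt(dt), 30),            # sigma * sqrt(T/M)
        "drift": ((r - 0.5 * sigma * sigma) * dt, 24),     # (r - 0.5*sigma^2) * T/M
        "ert": (math.exp(-r * T), 6),                      # exp(-r*T)
    }
    # Scale each value by 2**frac_bits and round to the nearest integer.
    return {name: int(round(v * (1 << frac)))
            for name, (v, frac) in params.items()}
```

For r = 0.05, sigma = 0.2, T = 1 and M = 4 this gives sigqrdt = 0.1 and drift = 0.0075 before scaling.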
Fig. 3 shows the results of a performance comparison between the option pricing calculation hardware accelerator of this embodiment and Monte Carlo option pricing calculations implemented on an existing CPU and GPU of the same process. As can be seen from fig. 3, the option pricing calculation hardware accelerator of this embodiment has significant advantages in throughput, power consumption and throughput-to-power ratio over the Monte Carlo option pricing calculations implemented on the CPU and GPU of the same process.
In addition, an embodiment further provides a hardware accelerator card, which includes an accelerator card body and an accelerator chip disposed on the accelerator card body, where the accelerator chip is the option pricing calculation hardware accelerator.
In addition, the embodiment also provides a computer device, which includes a motherboard on which a microprocessor and a memory connected to each other are mounted, and which further includes the option pricing calculation hardware accelerator, where the microprocessor and the option pricing calculation hardware accelerator are communicatively connected.
The option pricing calculation hardware accelerator is either integrated on the motherboard or plugged into the motherboard as an add-in card.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. 
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims (10)

1. An option pricing calculation hardware accelerator, comprising a first circuit unit for completing, over M time slices, one Monte Carlo simulation of the option price SM, the first circuit unit comprising:
a Gaussian random number generator for generating a Gaussian random number z;
a multiplier m2 for multiplying an input parameter sigqrdt by the Gaussian random number z, where sigqrdt = sigma * sqrt(T/M), sigma is a preset option price volatility, T is a preset option validity period, and M is a preset number of time slices simulated in each Monte Carlo iteration;
an adder a2 for summing an input parameter drift and the output of the multiplier m2, where drift = (r - 0.5 * sigma * sigma) * (T/M) and r is a preset risk-free interest rate;
an EXP module for performing an exponential operation on the output of the adder a2;
a multiplexer MUXA for selecting among the option initial price S0, the constant 1, and the multiplication result acc_sm output by the multiplier m5, wherein the multiplexer MUXA selects the option initial price S0 at the (dg + dm2 + da2 + de)-th clock cycle after reset so as to wait for the first Gaussian random number z0 to enter the multiplier m5, selects the option initial price S0 in the next cycle, then selects the constant 1 in the following 3 cycles, and selects the multiplication result acc_sm output by the multiplier m5 in the following M-4 cycles, these three selection operations being repeated I times to simulate all Monte Carlo simulation paths, where I is the preset number of Monte Carlo iterations, dg denotes the number of delay clock cycles of the Gaussian random number generator, dm2 denotes the number of delay clock cycles of the multiplier m2, da2 denotes the number of delay clock cycles of the adder a2, and de denotes the number of delay clock cycles of the EXP module;
the multiplier m5, for multiplying the output of the EXP module by the output of the multiplexer MUXA to obtain the multiplication result acc_sm and outputting the multiplication result acc_sm to a multiplication array;
and the multiplication array, for multiplying together the 4 intermediate multiplication results that are produced after the M time-slice simulations of one random path are completed and that are distributed across the four-stage pipeline of the multiplier m5, and for outputting, after 3 + 2dma beats, the final option price SM predicted by the current Monte Carlo iteration.
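The data flow of the first circuit unit can be mirrored by a short software reference model. The sketch below is an interpretation rather than the claimed circuit: it assumes that the four-stage pipeline of the multiplier m5 holds four interleaved partial products (one seeded with S0 and three seeded with the constant 1), which the multiplication array then combines. The function name simulate_path is hypothetical, and floating point stands in for the fixed-point hardware arithmetic.

```python
import math


def simulate_path(S0, sigqrdt, drift, zs):
    """Software model of one Monte Carlo path through m2, a2, EXP, m5 and the array."""
    # MUXA seeds the four pipeline slots: S0 once, then the constant 1 three times.
    partial = [S0, 1.0, 1.0, 1.0]
    for i, z in enumerate(zs):
        growth = math.exp(drift + sigqrdt * z)  # m2, then a2, then the EXP module
        partial[i % 4] *= growth                # m5 feeds acc_sm back every 4th cycle
    ma0 = partial[0] * partial[1]               # first level of the multiplication array
    ma1 = partial[2] * partial[3]
    return ma0 * ma1                            # ma2 yields the simulated price SM


# Example with M = 8 time slices and fixed, made-up Gaussian samples
zs = [0.3, -1.2, 0.5, 0.0, 2.1, -0.7, 0.9, -0.1]
SM = simulate_path(100.0, 0.1, 0.0075, zs)
```

Because multiplication is associative, the interleaved result equals S0 multiplied by the exponential of the summed increments, up to rounding.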
2. The option pricing calculation hardware accelerator of claim 1, wherein the multiplication array comprises:
a multiplier ma0 for multiplying the output of the multiplier m5 by the output of the multiplier m5 delayed through a one-stage register;
a multiplier ma1 for multiplying the output of the multiplier m5 delayed through a two-stage register by the output of the multiplier m5 delayed through a three-stage register;
and a multiplier ma2 for multiplying the outputs of the multipliers ma0 and ma1 to obtain the final option price SM predicted by the current Monte Carlo iteration.
3. The option pricing calculation hardware accelerator of claim 2, wherein the multiplication array further comprises an output for an available signal valid, and the available signal valid is first asserted at the (dg + dm2 + da2 + de + M + 3 + 2dma)-th clock cycle after reset and is thereafter asserted for one beat every M time slices, where dg represents the number of delay clock cycles of the Gaussian random number generator, dm2 represents the number of delay clock cycles of the multiplier m2, da2 represents the number of delay clock cycles of the adder a2, de represents the number of delay clock cycles of the EXP module, M represents the number of time slices simulated in each Monte Carlo iteration, dma represents the number of delay clock cycles of each of the multipliers ma0, ma1 and ma2, and 2dma represents twice dma.
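The timing rule for the valid signal amounts to simple arithmetic, sketched below in Python. The latency values in the example are made up purely for illustration, and the function name valid_cycles is hypothetical.

```python
def valid_cycles(dg, dm2, da2, de, dma, M, paths):
    """Return the clock cycles (counted from reset) at which valid is asserted."""
    # Front-end latency, M time slices, 3 register stages, and two multiplier levels.
    first = dg + dm2 + da2 + de + M + 3 + 2 * dma
    # After the first result, one new path result is available every M cycles.
    return [first + k * M for k in range(paths)]


# Hypothetical latencies: dg=10, dm2=3, da2=2, de=6, dma=3, with M=16 time slices
cycles = valid_cycles(dg=10, dm2=3, da2=2, de=6, dma=3, M=16, paths=3)
```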
4. The option pricing calculation hardware accelerator of claim 3, wherein the Gaussian random number generator comprises:
a uniformly distributed random number generation module for generating uniformly distributed random numbers URNs using the WELL19937 method;
and a Box-Muller module for converting the uniformly distributed random numbers URNs into Gaussian random numbers z using the Box-Muller method.
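The Box-Muller conversion can be sketched in Python as follows. Note the substitution: Python's standard random module uses the Mersenne Twister rather than WELL19937, so it only stands in for the uniform generator here; the function names box_muller and gaussian_stream are hypothetical.

```python
import math
import random


def box_muller(u1, u2):
    """Map two uniform samples (u1 in (0, 1], u2 in [0, 1)) to two Gaussian samples."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def gaussian_stream(n, seed=0):
    """Produce n standard normal samples from uniformly distributed inputs."""
    rng = random.Random(seed)      # Mersenne Twister, standing in for WELL19937
    out = []
    while len(out) < n:
        u1 = 1.0 - rng.random()    # shift [0, 1) to (0, 1] so that log(u1) is finite
        u2 = rng.random()
        out.extend(box_muller(u1, u2))
    return out[:n]


samples = gaussian_stream(20000)
```

Each pair of uniform inputs yields two independent Gaussian outputs, so the transform consumes uniform samples at the same rate as it produces Gaussian ones.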
5. The option pricing calculation hardware accelerator of claim 4, wherein the pipeline output of the first circuit unit is further connected to a second circuit unit for processing the outer-loop data path of the Monte Carlo option pricing calculation, the second circuit unit comprising:
a subtraction unit s0 for subtracting the strike price strike of the option from the option price SM predicted by the current Monte Carlo iteration;
a comparator for comparing the output of the subtraction unit s0 with 0 and outputting a control signal;
a multiplexer MUXB for selecting either 0 or the output of the subtraction unit s0 as its output according to the output of the comparator, wherein if the output of the subtraction unit s0 is greater than or equal to 0, the comparator outputs 1 and the multiplexer MUXB selects the output of the subtraction unit s0; otherwise the multiplexer MUXB selects 0;
a delay unit delay for generating an enable signal en from the available signal valid of the multiplication array;
and an accumulator for accumulating the output of the multiplexer MUXB under the control of the enable signal en to obtain an accumulated value sum_payoff of the option payoffs predicted by all Monte Carlo simulation paths.
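The second circuit unit computes the sum of max(SM - strike, 0) over all simulated paths. A minimal Python model, with the hypothetical function name accumulate_payoffs and made-up sample prices, is:

```python
def accumulate_payoffs(sm_values, strike):
    """Model of subtraction unit s0, the comparator, MUXB and the accumulator."""
    sum_payoff = 0.0
    for sm in sm_values:                   # one SM arrives per asserted enable signal en
        diff = sm - strike                 # subtraction unit s0
        select = 1 if diff >= 0 else 0     # comparator output
        sum_payoff += diff if select else 0.0  # MUXB picks diff or 0, then accumulate
    return sum_payoff


# Made-up terminal prices for four paths and a strike of 100
total = accumulate_payoffs([90.0, 100.0, 110.0, 125.0], strike=100.0)
```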
6. The option pricing calculation hardware accelerator according to claim 5, wherein the pipeline output of the second circuit unit is further connected to a third circuit unit for generating a final estimated option price from the accumulated value sum_payoff of the option payoffs predicted by all Monte Carlo simulation paths.
7. The option pricing calculation hardware accelerator of claim 6, wherein the third circuit unit comprises:
a shift unit for shifting the accumulated value sum_payoff of the option payoffs predicted by all Monte Carlo simulation paths so as to implement division;
and a multiplier m7 for multiplying the output of the shift unit by the discount rate ert to obtain the final payoff, where ert = exp(-r * T), r is a preset risk-free interest rate, and T is a preset option validity period.
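The third circuit unit can be modelled as a division by a power of two followed by discounting. The sketch below assumes that the number of Monte Carlo paths is a power of two, which is what makes a shift equivalent to the division; the function name discounted_price and the parameter log2_paths are hypothetical.

```python
import math


def discounted_price(sum_payoff, log2_paths, r, T):
    """Model of the shift unit followed by the multiplier m7."""
    # Shifting right by log2_paths bits divides by the number of paths
    # (assumption: the path count equals 2 ** log2_paths).
    mean_payoff = math.ldexp(sum_payoff, -log2_paths)
    ert = math.exp(-r * T)                 # discount rate, as defined in the claim
    return mean_payoff * ert               # multiplier m7


# Made-up accumulated payoff over 4 paths (log2_paths = 2)
price = discounted_price(35.0, log2_paths=2, r=0.05, T=1.0)
```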
8. A hardware accelerator card, comprising an accelerator card body and an accelerator chip arranged on the accelerator card body, wherein the accelerator chip is the option pricing calculation hardware accelerator according to any one of claims 1 to 7.
9. A computer device comprising a motherboard on which a microprocessor and a memory are mounted, wherein the microprocessor and the memory are connected to each other, and further comprising an option pricing calculation hardware accelerator according to any one of claims 1 to 7, wherein the microprocessor and the option pricing calculation hardware accelerator are connected in communication.
10. The computer device of claim 9, wherein the option pricing calculation hardware accelerator is either integrated on the motherboard or plugged into the motherboard as an add-in card.
CN202110674306.5A 2021-06-17 2021-06-17 Option pricing calculation hardware accelerator, accelerator card and computer equipment Active CN113312024B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110674306.5A CN113312024B (en) 2021-06-17 2021-06-17 Option pricing calculation hardware accelerator, accelerator card and computer equipment

Publications (2)

Publication Number Publication Date
CN113312024A true CN113312024A (en) 2021-08-27
CN113312024B CN113312024B (en) 2022-05-24

Family

ID=77379502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110674306.5A Active CN113312024B (en) 2021-06-17 2021-06-17 Option pricing calculation hardware accelerator, accelerator card and computer equipment

Country Status (1)

Country Link
CN (1) CN113312024B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114187046A (en) * 2021-12-13 2022-03-15 上海金融期货信息技术有限公司 Programming method and system for increasing option price calculation speed

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070294157A1 (en) * 2006-06-19 2007-12-20 Exegy Incorporated Method and System for High Speed Options Pricing
CN102968744A (en) * 2012-11-23 2013-03-13 上海睿云信息技术有限公司 Computer system and method for calculating convertible-bond share option
US20140074512A1 (en) * 2012-09-07 2014-03-13 The Travelers Indemnity Company Systems and methods for vehicle rental insurance
CN109741185A (en) * 2019-01-07 2019-05-10 中国人民解放军国防科技大学 Option pricing hardware accelerator
CN110334309A (en) * 2019-05-10 2019-10-15 李升东 Option data analysing method and device
CN111507837A (en) * 2020-04-10 2020-08-07 浙江万里学院 Option value calculation system based on time fractional order option pricing model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JONAS STENBREK HEGNER et al.: "Design of power efficient FPGA based hardware accelerators for financial applications", 《CONFERENCE:NORCHIP 2012》 *
今晚打佬虎: "Several different calculation methods for Monte Carlo valuation (Python)", 《HTTPS://BLOG.CSDN.NET/U014281392/ARTICLE/DETAILS/76285280》 *

Also Published As

Publication number Publication date
CN113312024B (en) 2022-05-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant