JP4950325B2

JP4950325B2 - Efficient parallel processing method of Monte Carlo method

Info

Publication number: JP4950325B2
Application number: JP2010152357A
Authority: JP
Inventors: 康高野; 隆一佐藤
Original assignee: Mizuho DL Financial Technology Co Ltd
Current assignee: Mizuho DL Financial Technology Co Ltd
Priority date: 2010-07-02
Filing date: 2010-07-02
Publication date: 2012-06-13
Anticipated expiration: 2030-07-02
Also published as: JP2012014591A

Description

The present invention relates to a parallel processing technique for the Monte Carlo method, and more particularly to a parallel processing technique for the Monte Carlo method that makes efficient use of a parallel computer, and to an apparatus and a computer program for executing it.

The Monte Carlo method is widely used for analyzing stochastic phenomena, including natural and social phenomena, and for complex numerical integration. It is particularly powerful for problems that are difficult to solve analytically. Because most natural and social phenomena involve uncertainty, the range of application of the Monte Carlo method is extremely wide, from academic fields to practical work. Another reason for its wide adoption is that its algorithm (calculation procedure) is easy to understand. Specific examples of industrial use include simulations in fluid dynamics, the calculation of risk amounts at financial institutions, and the pricing of various financial products (see, for example, Patent Document 1 and Non-Patent Document 1).

On the other hand, a sufficient number of trials is needed to ensure the calculation accuracy of the Monte Carlo method, and the drawback is that the calculation time grows as a result. In recent years, computers equipped with multiple processor cores have appeared one after another, and it has become relatively easy to shorten the calculation time of work formerly executed in a single thread by starting a thread on each processor core and processing in parallel. Here, a thread refers to a flow of processing within a process. The calculation time of the Monte Carlo method can likewise be shortened by parallel processing. In particular, the algorithm of the Monte Carlo method is very well suited to parallel processing, and demand for speeding it up through parallel processing is growing.

Applying parallel processing to the Monte Carlo method is itself nothing new. The trials of the Monte Carlo method as a whole are divided among a plurality of threads, and one thread is assigned to each computer or each processor core, so that each computer or processor core can carry out its processing independently. For example, when 10,000 trials are processed in parallel on four computers, each computer performs 2,500 trials.

In practice, however, two problems must be overcome. One is the reproducibility of calculation results, and the other is the correlation that arises between random number sequences.

The first, reproducibility of calculation results, means that the same calculation result is obtained whenever a numerical calculation by the Monte Carlo method is repeated with the same random number seed, that is, with the same random number sequence. Reproducibility is indispensable when backtesting and when verifying the effect of a change in calculation conditions. In general, parallelizing a program destroys the sequential order of processing, so ensuring reproducibility requires care in how random numbers are generated and used.

The second, correlation between random number sequences, refers to the appearance, in parallel processing of the Monte Carlo method, of a correlation between the random number sequences used by the individual threads that cannot be regarded as statistically zero. Such a correlation lowers the reliability of the calculation result, so when a separate random number sequence is generated for each computer or processor core, sequences with very small mutual correlation must be used. There are two conceivable solutions: perform the parallel processing with a single random number sequence, or generate random number sequences with very small mutual correlation and simply run them in parallel. Algorithms for the latter exist, but they raise many issues, such as the processing needed to generate the random number sequences before the Monte Carlo method is run and the need to verify the independence of the sequences. In practice, therefore, it is important to perform parallel processing with a single random number sequence while preserving the reproducibility of the calculation result.

When parallel processing is performed with a single random number sequence while preserving reproducibility, the way random numbers are generated during execution of the Monte Carlo method is of essential importance. The existing techniques for parallel processing of the Monte Carlo method are as follows.

[Prior Art 1]
FIG. 1 shows, as a prior art (hereinafter also called Prior Art 1), which trial numbers each thread handles when parallel processing is performed on a parallel processing system with four threads. The horizontal axis shows the first through fourth threads, and the vertical axis shows time t. Symbol A denotes the generation of the random numbers used in the trials that follow. Symbol B denotes the execution of each trial, and the number in parentheses after B is the trial number. Symbol C denotes the random number discarding process, or the state variable skip process, described later.

As shown in FIG. 1, Prior Art 1 distributes the trials equally among the threads. Specifically, taking the 10,000 trials mentioned above to be executed by four threads, the first thread handles the 1st through 2,500th trials, the second thread the 2,501st through 5,000th trials, the third thread the 5,001st through 7,500th trials, and the fourth thread the 7,501st through 10,000th trials. Each thread thus executes 2,500 trials.

A random number generator is prepared for each thread, and the same random number seed (random number sequence) is given to all of the generators. Each thread discards the random numbers it does not need for its own trials and then performs the trials assigned to it.

For example, if one trial uses 100 random numbers, the first thread generates 100 × 2,500 = 250,000 random numbers and performs its trials.

The second thread likewise needs 250,000 random numbers. However, it must generate 500,000 random numbers, discard the first 250,000, and use the last 250,000 for its trials. The first 250,000 random numbers are the ones the first thread uses for its trials, so the second thread cannot use them. If it did, the first and second threads would both run their trials on the same random numbers, and the reliability of the Monte Carlo result would suffer. This random number discarding process is denoted by symbol C in FIG. 1.

This is because pseudo-random numbers, unlike physical random numbers generated from a random physical phenomenon such as the roll of a die, are sequential in nature: each new random number is generated by substituting the previous one into an arithmetic expression. The second thread therefore cannot generate the 250,001st and subsequent random numbers without first generating the 1st through 250,000th.
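The sequential dependence described above can be illustrated with a minimal sketch. The linear congruential generator and its constants (`M`, `A`, `C`) are illustrative assumptions, as are the function names; the patent text does not prescribe them. The example is scaled down to five random numbers per thread.

```python
# Hypothetical linear congruential generator (LCG); the constants are
# illustrative assumptions, not taken from the patent.
M = 2**31
A = 1103515245
C = 12345


def lcg_stream(seed):
    """Yield pseudo-random numbers in [0, 1); each depends on the previous state."""
    state = seed
    while True:
        state = (A * state + C) % M
        yield state / M


def take_after_discarding(seed, n_discard, n_use):
    """Generate n_discard + n_use numbers and keep only the last n_use."""
    stream = lcg_stream(seed)
    for _ in range(n_discard):
        next(stream)  # generated only to advance the state, then thrown away
    return [next(stream) for _ in range(n_use)]


# Scaled-down version of the text's example: 5 random numbers per thread.
sequential = take_after_discarding(seed=42, n_discard=0, n_use=10)
thread1 = take_after_discarding(seed=42, n_discard=0, n_use=5)
thread2 = take_after_discarding(seed=42, n_discard=5, n_use=5)
print(thread1 + thread2 == sequential)
```

The second thread reaches its part of the sequence only by generating and discarding the part that belongs to the first thread, which is the cost the text attributes to symbol C.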

The third thread generates 750,000 random numbers. It discards 500,000 of them, namely the first 250,000 used by the first thread and the next 250,000 used by the second thread, and performs its trials with the 250,000 random numbers starting from the 500,001st.

Finally, the fourth thread generates 1,000,000 random numbers. It discards 750,000 of them, namely the first 250,000 used by the first thread, the next 250,000 used by the second thread, and the following 250,000 used by the third thread, and performs its trials with the 250,000 random numbers starting from the 750,001st.

Note that the first thread performs no discarding.

This makes possible parallel processing that overcomes the two problems above. With this method, however, a thread handling later trials must discard more random numbers, so although the number of trials is equal across threads, the processing load is not. In the example of FIG. 1, the first thread performs no discarding and therefore has the smallest load of the four threads. The number of random numbers generated increases in the order of the second, third, and fourth threads, and the load increases in the same order. The fourth thread thus has the largest load of the four.
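The per-thread counts in this example can be tabulated with a short sketch. The function name `equal_split_plan` and the dictionary layout are hypothetical, and the sketch assumes the number of trials is divisible by the number of threads, as in the text's example.

```python
def equal_split_plan(n_trials, n_threads, randoms_per_trial):
    """Per-thread trial range and random-number counts under an equal split.

    Assumes n_trials is divisible by n_threads, as in the text's example.
    """
    per_thread = n_trials // n_threads
    plan = []
    for k in range(n_threads):
        discarded = k * per_thread * randoms_per_trial  # numbers owned by earlier threads
        used = per_thread * randoms_per_trial           # numbers consumed by own trials
        plan.append({
            "thread": k + 1,
            "first_trial": k * per_thread + 1,
            "last_trial": (k + 1) * per_thread,
            "discarded": discarded,
            "used": used,
            "generated": discarded + used,
        })
    return plan


plan = equal_split_plan(n_trials=10_000, n_threads=4, randoms_per_trial=100)
for row in plan:
    print(row)
```

Every thread uses the same number of random numbers, but the number generated grows linearly with the thread index, which is the load imbalance the text describes.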

The Monte Carlo processing ends when the fourth thread finishes the 10,000th trial. The theoretical calculation time t_p1 required for the fourth thread to finish the 10,000th trial can be expressed as follows.

Here, N_Sim is the number of trials of the Monte Carlo method, N_T is the number of threads performing parallel processing, t_Sim is the time required for one trial, and t_Skip is the time required to discard the random numbers used for one trial.
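An expression for t_p1 can be written out from the load model described for FIG. 1. The following is a reconstruction from the surrounding description, not a quotation of the patent's own equation: it assumes that N_Sim is divisible by N_T and that the last thread first discards the random numbers of every trial assigned to the other threads and then runs its own share.

```latex
t_{p1}
  = \frac{N_{Sim}}{N_T}\, t_{Sim}
    + \left( N_{Sim} - \frac{N_{Sim}}{N_T} \right) t_{Skip}
  = \frac{N_{Sim}}{N_T}\, t_{Sim}
    + \frac{N_T - 1}{N_T}\, N_{Sim}\, t_{Skip}
```

For the example with N_Sim = 10,000 and N_T = 4 this gives 2,500 t_Sim + 7,500 t_Skip, matching the 2,500 trials run and the 7,500 trials' worth of random numbers discarded by the fourth thread.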

With this method, the imbalance in processing load becomes more pronounced as the number of threads grows, so the efficiency of parallel processing deteriorates as threads are added.

[Prior Art 2]
Patent Document 1 describes an improvement on the above method. According to Patent Document 1, in the Monte Carlo method used for various calculations in financial engineering, the trials are not assigned equally to the threads; instead, threads handling later trials are given fewer trials so that the processing load of the threads is evened out. This is shown in FIG. 2. As in FIG. 1, symbol A denotes the generation of the random numbers used in the Monte Carlo processing that follows, symbol B denotes the execution of a trial, the number in parentheses after B is the trial number, and symbol C denotes the random number discarding process.

As shown in FIG. 2, for example, the first thread handles the 1st through 3,000th trials, the second thread the 3,001st through 5,750th trials, the third thread the 5,751st through 8,000th trials, and the fourth thread the 8,001st through 10,000th trials.

That is, the first thread handles 3,000 trials, the second thread 2,750 trials, the third thread 2,250 trials, and the fourth thread 2,000 trials.

In this way, Prior Art 2 gives fewer trials to the threads that must generate more random numbers, and thereby evens out the load across the four threads as a whole.

In this prior art, when the trials are assigned so that the processing load of the processor cores is uniform, the theoretical calculation time t_p2 required to execute the trials can be expressed as follows.

Here, N_Sim is the number of trials of the Monte Carlo method, N_T is the number of threads performing parallel processing, t_Sim is the time required for one trial, and t_Skip is the time required to discard the random numbers used for one trial.
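An expression for t_p2 can be derived from the balancing condition described for Prior Art 2. The derivation below is a sketch under stated assumptions, not a quotation of the patent's own equation: n_k (the number of trials of thread k) and S_k (the cumulative count) are symbols introduced here, the n_k are treated as real numbers, and 0 < t_Skip < t_Sim is assumed.

```latex
% Equal load: every thread k = 1, ..., N_T finishes at the same time t_{p2}
n_k\, t_{Sim} + S_{k-1}\, t_{Skip} = t_{p2},
\qquad S_k = \sum_{j=1}^{k} n_j, \quad S_0 = 0

% Substituting n_k = S_k - S_{k-1}, with r = t_{Skip} / t_{Sim}
S_k = \frac{t_{p2}}{t_{Sim}} + (1 - r)\, S_{k-1}
\;\;\Longrightarrow\;\;
S_k = \frac{t_{p2}}{t_{Skip}} \left[ 1 - (1 - r)^{k} \right]

% Imposing S_{N_T} = N_{Sim}
t_{p2} = \frac{N_{Sim}\, t_{Skip}}
              {1 - \left( 1 - \dfrac{t_{Skip}}{t_{Sim}} \right)^{N_T}}
```

Under these assumptions the denominator tends to 1 as N_T grows, so t_p2 tends to N_Sim t_Skip.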

This technique also has problems. First, assigning the optimal number of trials to each thread so that the load is uniform requires measuring the values of t_Sim and t_Skip, which calls for an additional program. Second, each thread still has to discard random numbers, so although the result is better than Prior Art 1, the efficiency of parallel processing still falls as the number of threads increases.
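The dependence on measured timings can be made concrete with a sketch of such an allocation calculation. The function `balanced_allocation`, the load model in its docstring, and the timing values passed to it are all assumptions made for illustration; they are not taken from Patent Document 1, and the trial counts are left as real numbers rather than rounded.

```python
def balanced_allocation(n_sim, n_threads, t_sim, t_skip):
    """Real-valued trial counts that equalize the per-thread load.

    Assumed load model: thread k spends t_sim on each of its own trials plus
    t_skip for every trial assigned to earlier threads (0 < t_skip < t_sim).
    """
    r = t_skip / t_sim
    # Common finishing time implied by the balancing condition.
    total_time = n_sim * t_skip / (1.0 - (1.0 - r) ** n_threads)
    counts = []
    done = 0.0  # trials assigned to earlier threads
    for _ in range(n_threads):
        n_k = (total_time - done * t_skip) / t_sim
        counts.append(n_k)
        done += n_k
    return counts, total_time


# Hypothetical measured timings: discarding costs 10% of a trial.
counts, total_time = balanced_allocation(10_000, 4, t_sim=1.0, t_skip=0.1)
print([round(n, 1) for n in counts], round(total_time, 1))
```

Changing either timing changes every thread's share, which is why the text notes that the timings have to be measured before the trials can be assigned.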

Taking the limit of the number of threads N_T → ∞ in Expressions (2) and (3) gives N_Sim t_Skip in both cases. That is, however many threads are added, a calculation time of N_Sim t_Skip is still required. This is because random numbers are generated sequentially, so the Monte Carlo method as a whole cannot be completely parallelized.

Patent Document 1: Japanese Patent No. 4032339
Non-Patent Document 1: Tatsuya Ishikawa and Yoshihiko Uchida, "Pricing and calculation of risk amounts by the Monte Carlo method: a study of appropriate implementation when normal random numbers are used," Kin'yu Kenkyu, Institute for Monetary and Economic Studies, Bank of Japan, June 2002, Vol. 21, Special Issue No. 1, pp. 51-90

An object of the present invention is to provide an efficient parallel processing technique that does not perform the wasteful discarding of random numbers in each thread that the prior art required, and that does not require the number of trials handled by each thread to be calculated and fixed in advance in order to even out the processing load.

The present invention provides a parallel processing method in which execution of the Monte Carlo method by a computer is processed in parallel using a state variable generation storage unit and a plurality of threads, the method comprising: a step of storing a state variable in the state variable generation storage unit; a random number generation step in which the processor core of one thread, under exclusive control such that only the processor core of the one thread has access, generates random numbers by sequentially performing skip processing and conversion processing a predetermined number of times on the state variable stored in the state variable generation storage unit; and a trial calculation step in which the processor core of the one thread uses the generated random numbers to perform the calculation of a trial assigned to the one thread, wherein the random number generation step and the trial calculation step are also executed in turn, concurrently, in the other threads.

The present invention also provides the above parallel processing method in which the random number generation step comprises: a step in which the processor core, in one thread, determines whether an exclusive lock is held on the state variable generation storage unit storing the state variable used to generate random numbers; a step in which, if no exclusive lock is held on the state variable generation storage unit, the processor core of the one thread places an exclusive lock on it; a processing step in which the processor core of the one thread sequentially performs skip processing and conversion processing a predetermined number of times on the state variable stored in the state variable generation storage unit; and a step in which the processor core of the one thread releases the exclusive lock on the state variable generation storage unit.
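The lock, generate, unlock, then compute cycle described above can be sketched as follows. This is a minimal illustration under several assumptions, not the patent's implementation: the generator is a hypothetical linear congruential generator, the class and function names are invented for the example, the shared object also hands out the trial index (the text only says a trial is "assigned" to the thread), and the trial itself is the circle-area estimate mentioned among the program examples in the drawings.

```python
import threading

M, A, C = 2**31, 1103515245, 12345  # hypothetical LCG constants


class SharedGenerator:
    """State variable shared by all threads, guarded by an exclusive lock."""

    def __init__(self, seed, n_trials, randoms_per_trial):
        self._state = seed
        self._lock = threading.Lock()
        self._next_trial = 0  # assumption: trial indices are handed out here
        self._n_trials = n_trials
        self._k = randoms_per_trial

    def draw_for_next_trial(self):
        """Return (trial_index, randoms), or None once all trials are handed out."""
        with self._lock:  # waits while another thread holds the lock, then locks
            if self._next_trial >= self._n_trials:
                return None
            trial = self._next_trial
            self._next_trial += 1
            randoms = []
            for _ in range(self._k):
                self._state = (A * self._state + C) % M  # skip: advance the state
                randoms.append(self._state / M)          # conversion: state -> number
            return trial, randoms
        # leaving the "with" block releases the exclusive lock


def run_trial(randoms):
    """One trial: does a uniform point in the unit square fall in the circle of radius 0.5?"""
    x, y = randoms
    return 1 if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25 else 0


def monte_carlo(n_trials, n_threads, seed=2010):
    gen = SharedGenerator(seed, n_trials, randoms_per_trial=2)
    hits = [0] * n_trials

    def worker():
        while True:
            item = gen.draw_for_next_trial()
            if item is None:
                return
            trial, randoms = item
            hits[trial] = run_trial(randoms)  # the trial runs outside the lock

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(hits) / n_trials  # estimate of the area of the circle of radius 0.5


print(monte_carlo(2000, 4) == monte_carlo(2000, 1))
```

Because trial i always receives the same slice of the single random number sequence regardless of which thread picks it up, the result does not depend on the number of threads, and no thread generates random numbers that it then throws away.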

Furthermore, the present invention provides a parallel processing method in which execution of the Monte Carlo method by a computer is processed in parallel using a state variable generation storage unit and a plurality of threads, the method comprising: a step of storing a state variable in the state variable generation storage unit; a state variable skip processing step in which the processor core of one thread, under exclusive control such that only the processor core of the one thread has access, acquires the state variable stored in the state variable generation storage unit and sequentially performs skip processing on that state variable a predetermined number of times; a step in which the processor core of the one thread generates random numbers by performing conversion processing and skip processing a predetermined number of times on the acquired state variable; and a trial calculation step in which the processor core of the one thread uses the generated random numbers to perform the calculation of a trial assigned to the one thread, wherein each of these steps is also executed in turn, concurrently, in the other threads.

Here, the state variable skip processing step may comprise: a step in which the processor core, in one thread, determines whether an exclusive lock is held on the state variable generation storage unit storing the state variable used to generate random numbers; a step in which, if no exclusive lock is held on the state variable generation storage unit, the processor core of the one thread places an exclusive lock on it; a step in which the processor core of the one thread acquires the value of the state variable from the state variable generation storage unit; a processing step in which the processor core of the one thread sequentially performs skip processing a predetermined number of times on the state variable stored in the state variable generation storage unit; and a step in which the processor core of the one thread releases the exclusive lock on the state variable generation storage unit.
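In this second variant only the copy of the state variable and the skip processing happen under the lock, and the conversion into random numbers happens on the thread's private copy. A minimal sketch follows, assuming a hypothetical linear congruential generator in which `skip` advances the state and `convert` rescales it; the class and function names, the trial counter, and the stand-in trial (a sum of the random numbers) are all illustrative assumptions.

```python
import threading

M, A, C = 2**31, 1103515245, 12345  # hypothetical LCG constants


def skip(state):
    """Skip processing: advance the state variable by one step."""
    return (A * state + C) % M


def convert(state):
    """Conversion processing: turn a state variable into a random number in [0, 1)."""
    return state / M


class SharedState:
    def __init__(self, seed, n_trials, randoms_per_trial):
        self._state = seed
        self._lock = threading.Lock()
        self._next_trial = 0  # assumption: trial indices are handed out here
        self._n_trials = n_trials
        self._k = randoms_per_trial

    def acquire_and_skip(self):
        """Under the lock: copy the state, then skip the shared state past one trial."""
        with self._lock:
            if self._next_trial >= self._n_trials:
                return None
            trial = self._next_trial
            self._next_trial += 1
            local = self._state
            for _ in range(self._k):
                self._state = skip(self._state)  # no conversion inside the lock
            return trial, local


def randoms_from(local_state, k):
    """Outside the lock: conversion and skip on the thread's private copy."""
    out = []
    for _ in range(k):
        out.append(convert(local_state))
        local_state = skip(local_state)
    return out


def run(n_trials, n_threads, k=3, seed=7):
    shared = SharedState(seed, n_trials, k)
    results = [None] * n_trials

    def worker():
        while True:
            item = shared.acquire_and_skip()
            if item is None:
                return
            trial, local = item
            results[trial] = sum(randoms_from(local, k))  # stand-in for a real trial

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run(200, 4) == run(200, 1))
```

Compared with generating the random numbers inside the lock, this keeps the locked section to the state updates alone, at the price of each thread repeating the skips on its own copy.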

In the field of financial engineering, these parallel processing methods can be used to execute the Monte Carlo method in order to measure the credit risk amount of a portfolio containing a plurality of claims, to calculate the price of a financial product by modeling changes in financial markets, and to measure the market risk amount arising from fluctuations in financial markets.

The present invention also provides a program for causing a computer to execute each step of these parallel processing methods.

Furthermore, the present invention provides a parallel processing system that executes the Monte Carlo method by a computer through parallel processing using a state variable generation storage unit and a plurality of threads, the system comprising: a state variable generation storage unit that is shared by the plurality of threads and stores the state variable used to generate random numbers; a random number generation unit that generates random numbers by sequentially performing conversion processing and skip processing a predetermined number of times on the state variable stored in the state variable generation storage unit; an exclusive lock control unit that, for each thread, places an exclusive lock on the state variable generation storage unit or releases it; and a calculation execution unit that performs the calculation of a trial using the random numbers generated by the random number generation unit.

The present invention also provides a parallel processing system that executes the Monte Carlo method by a computer through parallel processing using a state variable generation storage unit and a plurality of threads, the system comprising: a state variable generation storage unit that is shared by the plurality of threads and stores the state variable used to generate random numbers; a state variable skip processing unit that sequentially performs skip processing a predetermined number of times on the state variable stored in the state variable generation storage unit; an exclusive lock control unit that, for each thread, places an exclusive lock on the state variable generation storage unit or releases it; a state variable acquisition unit that acquires the value of the state variable from the state variable generation storage unit and saves it in memory; a random number generation unit that generates random numbers by performing conversion processing and skip processing a predetermined number of times on the state variable saved in memory; and a calculation execution unit that performs the calculation of a trial using the random numbers generated by the random number generation unit.

According to one embodiment of the present invention, the Monte Carlo method can be used to measure the credit risk amount in the field of financial engineering. That is, when changes in the value of individual obligors (usually companies) are modeled in order to measure the credit risk amount (credit VaR, conditional credit VaR, and so on) of a portfolio or a bank, the overall credit risk amount can be calculated by running a Monte Carlo simulation with the method of the present invention.

In the same way, the Monte Carlo method of the present invention can be used to measure the market risk amount in financial markets. Specifically, by modeling changes in stock prices in the stock market or in bond prices in the bond market and running a Monte Carlo simulation with the method of the present invention, the market risk amount arising from fluctuations in these markets can be measured.

Alternatively, the Monte Carlo method of the present invention can be used to calculate the prices of financial products. That is, by modeling changes in financial markets and running a Monte Carlo simulation with the method of the present invention, the prices of financial products such as various securities and derivatives can be calculated.

The present invention realizes parallel processing of the Monte Carlo method with nearly ideal parallelization efficiency. It also has the advantage of being easy to implement. Because the invention is a technique for the Monte Carlo method in general and is not tied to any field of application, the range of fields that can benefit from the reduced computational load is, as noted above, considered to be extremely wide. Besides the various areas of financial engineering, it is useful in fields such as fluid dynamics and molecular dynamics. It also eliminates both of the drawbacks of the prior art described above, namely that the efficiency of parallelization falls as the number of threads increases because of wasteful discarding of random numbers, and that the number of trials assigned to each thread must be calculated in advance, enabling still more efficient parallel processing.

A schematic diagram showing how trials are assigned to each thread under the prior art.
A schematic diagram showing how trials are assigned to each thread under the prior art.
A schematic diagram showing the flow of the process of generating random numbers.
A block diagram of a parallel processing system that performs parallel processing of the Monte Carlo method.
A flowchart showing the flow of parallel processing of the Monte Carlo method.
A schematic diagram showing how trials are assigned to each thread.
A table comparing the calculation time with that of Prior Art 1.
A table comparing the calculation time with that of Prior Art 2.
An example program that obtains the area of a circle of radius 0.5 by sequential processing of the Monte Carlo method.
An example program that obtains the area of a circle of radius 0.5 by parallel processing of the Monte Carlo method.
An example program that obtains the area of a circle of radius 0.5 by parallel processing of the Monte Carlo method.
An example program of a random number generation class.

First, the random number generation process is described. FIG. 3 is a schematic diagram showing the flow of the process for generating random numbers. To generate random numbers, a value called a seed, from which the pseudo-random numbers are derived, is prepared first. This seed is denoted by reference numeral 101. The seed 101 can be entered through a keyboard or prepared by a dedicated program.

Then, in step S1, the state variable 102 is calculated from the prepared seed 101. A state variable is, in general, a vector whose elements are integers. Next, in step S2, a random number 103 is generated from the initial state variable 102. In step S3, a new state variable 104 is calculated from the initial state variable 102. In step S4, a random number 105 is generated from the state variable 104. In step S5, a new state variable 106 is calculated from the state variable 104. In step S6, a random number 107 is generated from the state variable 106.

Alternatively, steps S1, S3, S5, S7, ... can be executed first to compute as many state variables as the number of random numbers required, and steps S2, S4, S6, ... can then be executed to generate the random numbers.

In this way, multiple random numbers can be generated by repeating the calculation of a state variable and the generation of a random number. Specific examples of this kind of generator include the Mersenne Twister method and the linear congruential method. The linear congruential method fits the framework described above if the state variable itself is regarded as the random number.

A process that calculates a new state variable from an existing one, as in steps S3, S5, and S7, is called skip processing of the state variable. Each time skip processing is applied, the value of the state variable is updated.
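The split between skip processing and conversion processing can be sketched in C++ as follows. This is a minimal illustration and not the generator used in the patent: the type `SkipConvertGenerator`, its member names, and the choice of a 64-bit linear congruential generator (whose state variable is a single integer rather than a vector) are assumptions made for the example. The two functions produce the same sequence, once by interleaving conversion and skip, and once by computing all state variables first.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical generator for illustration: a 64-bit linear congruential
// generator. Its state variable is one integer; the text allows a vector.
struct SkipConvertGenerator {
    std::uint64_t state;

    // Step S1: compute the initial state variable from the seed.
    explicit SkipConvertGenerator(std::uint64_t seed) : state(seed) { skip(); }

    // Steps S3, S5, S7, ...: skip processing (state variable -> next state variable).
    void skip() {
        state = state * 6364136223846793005ULL + 1442695040888963407ULL;
    }

    // Steps S2, S4, S6, ...: conversion processing (state variable -> number in [0, 1)).
    static double convert(std::uint64_t s) {
        return static_cast<double>(s >> 11) * (1.0 / 9007199254740992.0);  // divide by 2^53
    }
};

// Interleaved order: convert, skip, convert, skip, ...
std::vector<double> generate_interleaved(std::uint64_t seed, int count) {
    assert(count >= 0);
    SkipConvertGenerator g(seed);
    std::vector<double> out;
    for (int i = 0; i < count; ++i) {
        out.push_back(SkipConvertGenerator::convert(g.state));
        g.skip();
    }
    return out;
}

// Alternative order: compute every state variable first, then convert them all.
std::vector<double> generate_states_first(std::uint64_t seed, int count) {
    assert(count >= 0);
    SkipConvertGenerator g(seed);
    std::vector<std::uint64_t> states;
    for (int i = 0; i < count; ++i) {
        states.push_back(g.state);
        g.skip();
    }
    std::vector<double> out;
    for (std::uint64_t s : states) out.push_back(SkipConvertGenerator::convert(s));
    return out;
}
```

Because conversion does not modify the state variable, the order in which the two kinds of step are scheduled does not change the numbers produced, which is the property the later embodiments rely on.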

As described above, random number generation can be divided into skip processing of the state variable and conversion processing, which converts a state variable into a random number. A random number can also be generated immediately after each skip. The present invention is based on the finding that sharing the value of the state variable among all threads under exclusive control makes the discarding of unneeded random numbers, which the prior art requires, unnecessary.

As one way of implementing the present invention, the following embodiment can be considered. A single random number generation module capable of holding the state variable is prepared and shared by all threads. Combining this module with parallel task scheduling as described below achieves an assignment of random numbers in which none are discarded.

For example, consider the case where the first thread enters the random number generation process for some trial. Under exclusive control, the first thread performs conversion processing and skip processing in turn in the module, generates the random numbers needed for one trial, and stores them in the thread's local memory. After this, the module holds the state variable obtained by the last skip. Exclusive control is then released, and the first thread performs one Monte Carlo trial using the random numbers stored in its local memory. Next, the second thread, likewise under exclusive control, performs in the module the skip processing and conversion processing for the random numbers needed for one trial. Processing proceeds in the same way for the other threads.
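A minimal C++ sketch of this first embodiment follows. It assumes `std::mutex` as the exclusive control mechanism and a simple linear congruential update as the generator; the names `SharedGenerator`, `draw_for_trial`, and `draw_in_parallel` are hypothetical and do not come from the patent. The helper `draw_in_parallel` only collects the numbers each thread received so that the assignment can be inspected.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// The single random number generation module shared by all threads.
class SharedGenerator {
public:
    explicit SharedGenerator(std::uint64_t seed) : state_(seed) {}

    // Under exclusive control, convert and skip `count` times and return
    // the numbers for one trial as a thread-local buffer.
    std::vector<double> draw_for_trial(int count) {
        assert(count > 0);
        std::vector<double> local;
        local.reserve(static_cast<std::size_t>(count));
        std::lock_guard<std::mutex> lock(mutex_);  // exclusive control starts
        for (int i = 0; i < count; ++i) {
            local.push_back(static_cast<double>(state_ >> 11) / 9007199254740992.0);
            state_ = state_ * 6364136223846793005ULL + 1442695040888963407ULL;
        }
        return local;  // lock is released when `lock` goes out of scope
    }

private:
    std::mutex mutex_;
    std::uint64_t state_;  // holds the state variable left by the last skip
};

// Each thread repeatedly claims the numbers for one trial. A real trial
// would consume `local` outside the lock; here the numbers are only collected.
std::vector<double> draw_in_parallel(SharedGenerator& gen, int num_threads,
                                     int trials_per_thread, int per_trial) {
    std::vector<std::vector<double>> per_thread(static_cast<std::size_t>(num_threads));
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&gen, &per_thread, t, trials_per_thread, per_trial] {
            for (int i = 0; i < trials_per_thread; ++i) {
                std::vector<double> local = gen.draw_for_trial(per_trial);
                per_thread[t].insert(per_thread[t].end(), local.begin(), local.end());
            }
        });
    }
    for (std::thread& w : workers) w.join();
    std::vector<double> all;
    for (const std::vector<double>& v : per_thread) all.insert(all.end(), v.begin(), v.end());
    std::sort(all.begin(), all.end());
    return all;
}
```

Taken together, the threads receive exactly the numbers a single sequential generator would have produced, with none skipped and none handed out twice. The cost is that every number for a trial is buffered in local memory while the lock is held.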

In the above embodiment, as the number of random numbers used in one trial grows, so do the amount of local memory each thread uses and the number of memory accesses needed to store and retrieve the random numbers. When one trial uses many random numbers, the time spent on memory access can therefore become non-negligible and increase the processing time. The next embodiment shows a way to avoid this by keeping memory usage to a minimum.

As another way of implementing the present invention, the following embodiment can be considered. Modules with two different forms of memory holding are prepared, one for each purpose: skip processing and conversion processing. One module for storing the state variable and performing skip processing is prepared (hereinafter, module m_0) and shared by all threads. In addition, a module that generates random numbers by applying conversion processing and skip processing to a state variable is prepared for each thread (hereinafter, m_1, m_2, ..., m_{N_T}).

Combining these (N_T + 1) modules with parallel task scheduling as described below achieves an assignment of random numbers in which none are discarded. For example, consider the case where the first thread enters the random number generation process for some trial. Under exclusive control, the first thread copies the state variable stored in module m_0 to module m_1. It then performs, in module m_0, skip processing for as many random numbers as one trial requires. After this, module m_0 holds the state variable obtained by the last skip. Exclusive control is then released, and the first thread performs conversion processing and skip processing locally using module m_1 to generate the random numbers needed to run the trial. Next, the second thread, likewise under exclusive control, copies the state variable stored in module m_0 to module m_2 and then performs the skip processing in module m_0. Processing proceeds in the same way for the other threads.
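The copy-then-skip sequence of this second embodiment might look like the following in C++. This is a sketch under stated assumptions: `Lcg` is a hypothetical linear congruential module standing in for m_0 and for the per-thread modules, `SharedState::claim` is an invented name for the locked step, and `std::mutex` is one possible exclusive control mechanism.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical module type: holds one state variable and supports
// skip processing and conversion processing.
struct Lcg {
    std::uint64_t state;
    void skip() { state = state * 6364136223846793005ULL + 1442695040888963407ULL; }
    double convert() const { return static_cast<double>(state >> 11) / 9007199254740992.0; }
};

// Module m_0: shared by all threads, used only for storage and skipping.
class SharedState {
public:
    explicit SharedState(std::uint64_t seed) : m0_{seed} {}

    // Under exclusive control: copy m_0's state variable for the caller,
    // then skip m_0 past the numbers that one trial will use.
    Lcg claim(int numbers_per_trial) {
        assert(numbers_per_trial > 0);
        std::lock_guard<std::mutex> lock(mutex_);
        Lcg local = m0_;                                   // copy to m_i
        for (int i = 0; i < numbers_per_trial; ++i) m0_.skip();
        return local;                                      // lock released here
    }

private:
    std::mutex mutex_;
    Lcg m0_;
};

// Outside the lock: the thread's own module converts and skips locally.
std::vector<double> generate_locally(Lcg local, int count) {
    std::vector<double> out;
    for (int i = 0; i < count; ++i) {
        out.push_back(local.convert());
        local.skip();
    }
    return out;
}
```

Only one state variable crosses the lock per trial, so the locked region involves no conversion and no buffer of random numbers. The skip work is done twice, once in m_0 and once in the local module.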

This series of operations also involves no discarding of unneeded random numbers. Moreover, each trial is assigned the same random numbers regardless of which thread handles it or in what order, so the results are reproducible, and the number of trials each thread handles need not be decided in advance.

In this series of operations, only the state variable stored in module m_0 is copied to each thread's local memory, so memory access can be reduced compared with the first embodiment. On the other hand, state variable skip processing has to be performed twice: once in module m_0 and again when each thread generates its random numbers. Which embodiment runs faster therefore depends on the hardware and on the problem being analyzed.

As the two embodiments above show, the present invention requires exclusive control over skip processing of the state variable. Various methods of implementing exclusive control are known, and the present invention does not presuppose any particular one. Any method can be used as long as it prevents several threads from simultaneously performing skip processing on the state variable they share. This exclusive control can be carried out, for example, using the concepts of a synchronization object and an exclusive lock. Before entering the part that requires exclusive control, that is, a particular section of the processing, a thread locks a certain object. While the lock is held, other threads cannot lock the same object and must wait until it is unlocked. The thread that took the lock unlocks the object after finishing that section. The object being locked is called the synchronization object, and the locking operation is called an exclusive lock. The synchronization object can be the state variable generation storage unit described below, or it can be some other object or abstract data.

FIG. 4 is a block diagram of a parallel processing system 400 that executes the Monte Carlo method in parallel according to one embodiment. The parallel processing system 400 has an input device 410, including a keyboard and a mouse, and a display device 420 such as an LCD (Liquid Crystal Display).

The parallel processing system 400 further has a parallel processing control unit 430, a first thread 440, a second thread 450, a third thread 460, a fourth thread 470, and a global storage unit 480. The first to fourth threads 440, 450, 460, and 470 can be realized using a computer with four or more processor cores, or using four or more computers. The parallel processing system 400 itself can also be realized using a computer system that includes a multiprocessor computer with four or more processors.

The global storage unit 480 has a state variable generation storage unit 481, which stores the state variable, and a state variable skip processing unit 482. Each of the first to fourth threads can access the state variable generation storage unit 481 and the state variable skip processing unit 482 to obtain the value of the state variable and to perform skip processing.

On receiving an instruction from the input device 410, the parallel processing control unit 430 instructs the state variable generation storage unit 481, the state variable skip processing unit 482, the first thread 440, the second thread 450, the third thread 460, and the fourth thread 470 to process the Monte Carlo method in parallel. Each thread can be executed by one processor core.

At this time, the parallel processing control unit 430 generates a state variable from the seed received from the input device 410 and stores it in the state variable generation storage unit 481. Alternatively, the parallel processing control unit 430 can generate a seed itself using a dedicated program, generate a state variable from that seed, and store it in the state variable generation storage unit 481.

The first thread 440, the second thread 450, the third thread 460, and the fourth thread 470 execute the Monte Carlo method in parallel under the control of the parallel processing control unit 430. The parallel processing system 400 of this embodiment has four threads, but the invention is not limited to this. The number of threads can be any integer and can be changed to suit the hardware configuration.

The first to fourth threads 440, 450, 460, and 470 all have the same structure. For example, the first thread has a trial count unit 441, an exclusive lock control unit 442, a state variable acquisition unit 443, a random number generation unit 445, a trial execution unit 446, and a local storage unit 447.

The parallel processing system 400 also has a result output unit 490. The result output unit 490 can aggregate the results of the trials executed by the four threads 440, 450, 460, and 470 and output them to the display device 420.

FIG. 5 is a flowchart of the processing that each of the first to fourth threads 440, 450, 460, and 470 shown in FIG. 4 performs in parallel. As an example, the number of trials N_Sim is set to 10,000 (N_Sim = 10000).

The flow of processing is described below using the first thread 440 as an example. First, processing starts in step S501. In step S502, the trial count unit 441 compares the variable n, which indicates the number of the current trial, with the number of trials N_Sim. The variable n is assumed to have been initialized to 0 at the start of processing.

If the trial count unit 441 determines in step S502 that n < N_Sim, processing proceeds to step S503. Otherwise, processing proceeds to step S511 and ends.

In step S503, the trial count unit 441 increments n. That is, n + 1 is calculated and the result becomes the new value of n. This new value of n is the number of the trial about to be executed.

In step S504, the exclusive lock control unit 442 attempts to access the state variable generation storage unit 481. If none of the exclusive lock control units 452, 462, and 472 of the second to fourth threads holds an exclusive lock on the state variable generation storage unit 481, the first thread 440 can access it, and processing proceeds to step S505.

On the other hand, if at step S504 one of the exclusive lock control units 452, 462, and 472 of the second to fourth threads, for example the exclusive lock control unit 472 in the fourth thread, holds an exclusive lock on the state variable generation storage unit 481, the first to third threads 440, 450, and 460 cannot access the state variable generation storage unit 481. In that case, step S504 is executed again after a fixed interval.
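The retry behavior of step S504 can be sketched with `std::mutex::try_lock`. The function name `acquire_with_retry` and the retry interval parameter are assumptions made for this example; the patent does not prescribe a particular locking API. Note that the C++ standard allows `try_lock` to fail spuriously, so the attempt count may exceed one even when no other thread holds the lock.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Step S504 as polling: try to take the lock and, if another thread
// holds it, wait for a fixed interval and try again. Returns the number
// of attempts made. On return the caller owns the lock (step S505) and
// must call unlock() when the protected section ends (step S508).
int acquire_with_retry(std::mutex& sync_object, std::chrono::microseconds retry_interval) {
    assert(retry_interval.count() >= 0);
    int attempts = 1;
    while (!sync_object.try_lock()) {
        std::this_thread::sleep_for(retry_interval);
        ++attempts;
    }
    return attempts;
}
```

In many programs a blocking `lock()` call or a `std::lock_guard` would be the simpler choice, since the library then handles the waiting. The polling form is shown only because it mirrors the flowchart step.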

In step S505, the exclusive lock control unit 442 of the first thread 440 places an exclusive lock on the state variable generation storage unit 481. As a result, none of the second, third, and fourth threads 450, 460, and 470 can access the state variable generation storage unit 481.

If access to the state variable generation storage unit 481 were not under exclusive control, several threads could access it at the same time and would generate random numbers from the same state variable. Some of the random numbers used by different threads would then be duplicated, and the resulting Monte Carlo results would be unreliable. The exclusive access control described above is effective in preventing several threads from using the same state variable.

Next, in step S506, the state variable acquisition unit 443 of the first thread accesses the state variable generation storage unit 481, obtains the value of the state variable stored there, and saves it in the local storage unit 447.

Next, in step S507, the state variable skip processing unit 482 performs skip processing on the state variable stored in the state variable generation storage unit 481. Specifically, the state variable skip processing unit 482 repeatedly executes steps S3, S5, S7, ... shown in FIG. 3. The number of repetitions equals the number of random numbers needed for one simulation.

For example, if one trial requires 100 random numbers, then in step S507 the state variable skip processing unit 482 repeats the state variable skip processing shown in FIG. 3 (steps S3, S5, S7, ...) 100 times.

At this point, the local storage unit 447 of the first thread 440 holds the state variable obtained in step S506, and the state variable generation storage unit 481 holds the state variable corresponding to the first random number to be used in the next trial.

Next, in step S508, the exclusive lock control unit 442 of the first thread 440 releases the exclusive lock on the state variable generation storage unit 481. As a result, the second to fourth threads 450, 460, and 470 can access the state variable generation storage unit 481.

In step S509, the state variable saved in the local storage unit 447 is retrieved and its value is passed to the random number generation unit 445.

In step S510, the trial execution unit 446 executes one trial while the random number generation unit 445 generates 100 random numbers. Processing then returns to step S502.

In this way, the first thread 440 can execute its share of the Monte Carlo method. The second thread 450, the third thread 460, and the fourth thread 470 can likewise execute their shares by carrying out the processing shown in FIG. 5. As a result, the first thread 440, the second thread 450, the third thread 460, and the fourth thread 470 can execute all the trials in parallel.
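The flow of FIG. 5 can be sketched as a C++ worker loop using `std::thread` and `std::mutex`. Several things here are assumptions for illustration: the `Lcg` type and `run_trials` function are hypothetical, the "trial" simply averages its random numbers, and the trial counter n is read and incremented inside the same locked region as the state variable (in the flowchart, steps S502 and S503 come before the lock) so that trial numbers and state variables are always claimed together.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical generator with separate skip and conversion steps.
struct Lcg {
    std::uint64_t state;
    void skip() { state = state * 6364136223846793005ULL + 1442695040888963407ULL; }
    double convert() const { return static_cast<double>(state >> 11) / 9007199254740992.0; }
};

// Runs n_sim trials on num_threads threads. results[i] holds the outcome
// of trial number i + 1 (here: the mean of the trial's random numbers).
std::vector<double> run_trials(std::uint64_t seed, int n_sim,
                               int numbers_per_trial, int num_threads) {
    assert(n_sim >= 0 && numbers_per_trial > 0 && num_threads > 0);
    std::mutex sync_object;   // guards `shared` and `n`
    Lcg shared{seed};         // plays the role of storage unit 481
    int n = 0;                // number of the most recently claimed trial
    std::vector<double> results(static_cast<std::size_t>(n_sim));

    auto worker = [&] {
        for (;;) {
            int trial = 0;
            Lcg local{0};
            {
                std::lock_guard<std::mutex> guard(sync_object);  // S504, S505
                if (n >= n_sim) return;                          // S502 -> S511
                trial = ++n;                                     // S503
                local = shared;                                  // S506
                for (int i = 0; i < numbers_per_trial; ++i) shared.skip();  // S507
            }                                                    // S508: unlock
            double sum = 0.0;                                    // S509, S510
            for (int i = 0; i < numbers_per_trial; ++i) {
                sum += local.convert();
                local.skip();
            }
            results[static_cast<std::size_t>(trial - 1)] = sum / numbers_per_trial;
        }
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) threads.emplace_back(worker);
    for (std::thread& th : threads) th.join();
    return results;
}
```

Calling `run_trials` with the same seed but different thread counts should give identical per-trial results, because the random numbers a trial receives depend only on its trial number and not on which thread happens to run it.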

As explained through the above embodiment, the present invention eliminates wasted skip processing of the state variable by sharing the state variable among all threads under exclusive control.

FIG. 6 shows an example of how trials are divided among the threads in the above embodiment. The horizontal axis shows the first to fourth threads, and the vertical axis shows time t. As in FIG. 1 and FIG. 2, symbol A denotes the generation of the random numbers used in the Monte Carlo trial that follows, and symbol B denotes the execution of a trial. The number in parentheses after symbol B is the trial number. Unlike FIG. 1, which shows prior art 1, and FIG. 2, which shows prior art 2, in the above embodiment symbol C, which denotes the random number discarding process, is spread across the threads, so the processing load is evened out. The load on each thread can be evened out in this way because, through the state variable generation storage unit, all threads share the state variable used for the trials.

In the example shown in FIG. 6, the first thread executes trials n = 2, 6, ..., 9998; the second thread executes trials n = 1, 5, ..., 9997; the third thread executes trials n = 4, 8, ..., 10000; and the fourth thread executes trials n = 3, 7, ..., 9999. However, because the assignment of trial numbers to threads is determined automatically by the OS's parallel task scheduling, the trial numbers assigned to each thread generally differ from run to run. Nor does each thread necessarily execute the same number of trials. Even so, in the method of the present invention a trial with a given trial number is always assigned the same random numbers, so the result of the Monte Carlo run is reproducible.

On a multitasking OS, the OS may assign other tasks to a processor core even while trials are running. Even if work is assigned to the threads so that the amount of computation is equal, the time each processor core needs to finish is therefore not necessarily uniform. The method of the present invention has the further advantage that processor cores carrying less load from other tasks automatically receive more trials, so the load is balanced across processor cores automatically.

The parallel processing in the above embodiment ends when the 10,000th trial has been completed. The theoretical computation time t_MC required to finish the 10,000th trial can be expressed as follows.

[Equation (4)]

Here, N_Sim is the number of Monte Carlo trials, N_T is the number of threads executing the trials, t_Sim is the time required for one trial, and t_Skip is the time required for skip processing of the random numbers used in one trial.

When the number of threads satisfies N_T ≤ t_Sim / t_Skip, the skip processing of the other threads finishes while a thread is executing its trial, so no waiting time arises from the exclusive lock. In this case, as equation (4) shows, increasing the number of threads N_T shortens the computation time t_MC accordingly. When N_T ≥ t_Sim / t_Skip, on the other hand, increasing N_T does not change t_MC, because waiting time arises from the exclusive lock.

Table 1, shown in FIG. 7, compares the theoretical computation time of one embodiment of the present invention with that of prior art 1 described above. The horizontal axis is the number of threads, and the vertical axis is t_Sim / t_Skip. Each value in the table is (theoretical computation time of one embodiment of the present invention) / (theoretical computation time of prior art 1).

The larger the value of t_Sim / t_Skip, the more complex the trial, in the sense that processing other than random number generation takes longer. The case t_Sim / t_Skip = 1, where the time t_Sim required for one trial equals the time t_Skip required for skip processing of the random numbers used in one trial, means that random numbers are only generated and no computation using them is performed. This case has no practical meaning but is included for reference as a comparison of computation times.

According to Table 1, when the number of threads is 20 and t_Sim / t_Skip = 20, (theoretical computation time of one embodiment of the present invention) / (theoretical computation time of prior art 1) = 0.54. In other words, the computation time can be roughly halved compared with prior art 1.

Table 2, shown in FIG. 8, compares the theoretical computation time of one embodiment of the present invention with that of prior art 2 described above. As in Table 1, the horizontal axis is the number of threads, and the vertical axis is t_Sim / t_Skip. Each value in the table is (theoretical computation time of one embodiment of the present invention) / (theoretical computation time of prior art 2). To achieve its theoretical computation time, prior art 2 requires t_Sim and t_Skip to be measured before the Monte Carlo run, but the time needed for that measurement is not taken into account in this comparison.

According to Table 2, when the number of threads is 20 and t_Sim / t_Skip = 20, (theoretical computation time of one embodiment of the present invention) / (theoretical computation time of prior art 2) = 0.67. In other words, the computation time can be cut to roughly two thirds even compared with prior art 2, which is superior to prior art 1.

[Example]
An example in which the area of a circle of radius 0.5 is obtained by the Monte Carlo method is described below. First, FIG. 9 is an example program that performs the Monte Carlo method sequentially rather than in parallel. The C++ language is used as the programming language. The numbers "1", "2", ..., "20" shown at the left of each line of the program in FIG. 9 are line numbers and are not part of the program itself.

"AreaOfCircle" on line 1 is a function that obtains the area of a circle of radius 0.5 by the Monte Carlo method. Its argument "NSim" is a variable giving the number of trials, and its argument "seed" is the value from which the pseudo-random numbers are derived. The function "AreaOfCircle" is defined on lines 2 to 20.

The variables "x" and "y" on line 4 hold the generated random numbers as an x coordinate and a y coordinate. The variable "r_sq" on line 5 holds the square of the distance between the center (0.5, 0.5) of the circle of radius 0.5 and the point on the XY plane obtained by random number generation. The variable "s" on line 6 holds the number of points, among those obtained on the XY plane by random number generation, that fall inside the circle of radius 0.5.

"RandomNumberGenerator" on line 7 is a class that generates uniform random numbers on the interval (0, 1). Its details are described later. Line 10 initializes the argument "seed" of the function "AreaOfCircle".

Lines 13 to 18 form the loop that performs the Monte Carlo method. Starting from the trial number variable SimNo = 0, N_Sim trials are performed sequentially. Lines 14 and 15 each generate one random number, and these values become the X and Y coordinate values. In this way a point on the XY plane is chosen at random. Line 16 uses the Pythagorean theorem to calculate the square of the distance between the point on the XY plane determined on lines 14 and 15 and the center of the circle (0.5, 0.5), and stores it in the variable r_sq. Line 17 compares r_sq with 0.25 (= 0.5 * 0.5) and increments the variable s when r_sq < 0.25, that is, when the randomly chosen point lies inside the circle of radius 0.5. The value s / N_Sim on line 19 is the return value of the function AreaOfCircle, that is, the estimated area of the circle. The larger the number of trials N_Sim, the closer the estimated area tends to be to π * 0.5 * 0.5.
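The sequential procedure described above can be sketched roughly as follows. This is not the program of FIG. 9, which is not reproduced in the text. It is a minimal illustration that assumes std::mt19937 as a stand-in for the class RandomNumberGenerator, and the helper to_unit_interval is a hypothetical name introduced only for this sketch.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: map one raw 32-bit draw to the open interval (0, 1).
double to_unit_interval(std::mt19937& gen) {
    return (static_cast<double>(gen()) + 0.5) / 4294967296.0;  // divide by 2^32
}

// Sequential Monte Carlo estimate of the area of the circle of radius 0.5
// centered at (0.5, 0.5), which lies inside the unit square.
double AreaOfCircle(long N_Sim, unsigned int seed) {
    assert(N_Sim > 0);
    std::mt19937 gen(seed);  // assumption: stands in for RandomNumberGenerator
    long s = 0;              // number of points that fall inside the circle
    for (long SimNo = 0; SimNo < N_Sim; ++SimNo) {
        double x = to_unit_interval(gen);  // X coordinate value
        double y = to_unit_interval(gen);  // Y coordinate value
        // Squared distance from the center (0.5, 0.5).
        double r_sq = (x - 0.5) * (x - 0.5) + (y - 0.5) * (y - 0.5);
        if (r_sq < 0.25) {  // 0.25 = 0.5 * 0.5
            ++s;
        }
    }
    // Fraction of points inside the circle, i.e. the estimated area.
    return static_cast<double>(s) / static_cast<double>(N_Sim);
}
```

Calling AreaOfCircle(200000, 42u), for example, should return a value near π * 0.5 * 0.5, and the same seed gives the same estimate on every run.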

FIG. 10 is a parallelized version of the example program of FIG. 9. The example program of FIG. 10 is parallelized using OpenMP, a standard for parallel computation. OpenMP is mainly used on shared-memory parallel computers.

Line 1, as in FIG. 9, declares the function AreaOfCircle, which takes the variables N_Sim and seed as arguments.

Line 5 declares the random number generation class RandomNumberGenerator used to obtain the value of the state variable. This class is shared by all threads. As shown later in FIG. 12, the class contains a member variable corresponding to the state variable and can therefore serve as the state variable generation storage unit 481. It also contains methods that skip the state variable, obtain the value of the state variable, and generate random numbers.

The variable s on line 6, as in FIG. 9, stores the number of points that fall inside the circle of radius 0.5, out of all the points on the XY plane obtained by random number generation.

Lines 12 to 32 are the portion that the threads execute in parallel. That is, starting from the variable SimNo = 0, the N_Sim trials are divided among the threads 440, 450, 460, and 470 shown in FIG. 4.

Lines 22 and 27 control the exclusive lock on the state variable generation storage unit 481. This corresponds to the processing performed by the exclusive lock control units 442, 452, 462, and 472 in FIG. 4. As a result, while one thread is generating random numbers (lines 25 and 26), no other thread can execute the random number generation process.

The value s / N_Sim calculated on line 33 is the return value of the function AreaOfCircle, that is, the estimated area of the circle. This calculation corresponds to the processing of the result output unit 490 in FIG. 4.

The statements on lines 12, 19, and 22 are OpenMP directives for parallel processing.

Note that in the example program of FIG. 10, unlike the block diagram of FIG. 4, the random number generation unit is shared by all threads, and the individual threads have no state variable acquisition unit or random number generation unit. The program nevertheless retains the feature of the invention that the state variable information is shared by all threads under exclusive control. Such an example is also within the scope of the present invention.
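The FIG. 10 approach, in which one generator is shared by all threads and every draw happens under an exclusive lock, might look roughly like the sketch below. It is not the program of FIG. 10. It assumes std::mt19937 in place of RandomNumberGenerator, and the function name AreaOfCircleParallel and the critical-section name rng_state are hypothetical. The pragmas take effect only when the compiler has OpenMP enabled; otherwise the loop simply runs sequentially.

```cpp
#include <cassert>
#include <random>

// Parallel estimate with a single generator shared by all threads.
double AreaOfCircleParallel(long N_Sim, unsigned int seed) {
    assert(N_Sim > 0);
    std::mt19937 shared_gen(seed);  // shared state (role of unit 481)
    long s = 0;

    // The trials are divided among the threads; s is combined at the end.
    #pragma omp parallel for reduction(+ : s)
    for (long SimNo = 0; SimNo < N_Sim; ++SimNo) {
        double x = 0.0;
        double y = 0.0;

        // Exclusive lock: only one thread at a time may advance the generator.
        #pragma omp critical(rng_state)
        {
            x = (static_cast<double>(shared_gen()) + 0.5) / 4294967296.0;
            y = (static_cast<double>(shared_gen()) + 0.5) / 4294967296.0;
        }

        // The trial itself runs outside the lock.
        double r_sq = (x - 0.5) * (x - 0.5) + (y - 0.5) * (y - 0.5);
        if (r_sq < 0.25) {
            ++s;
        }
    }
    return static_cast<double>(s) / static_cast<double>(N_Sim);
}
```

Because both draws of a trial are taken inside one critical section, the generator output is always consumed in consecutive pairs. The set of points is therefore the same as in a sequential run with the same seed, and the count s does not depend on how the threads are scheduled.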

In the example program of FIG. 10, only two random numbers are used per trial. In a simulation that uses many random numbers per trial, however, the amount of memory used by each thread increases, and so does the number of memory accesses. Consequently, depending on the hardware and the subject of the analysis, the time required for memory access may become non-negligible in a program like that of FIG. 10.

One way to reduce the possibility of this increase in processing time is to keep the processing inside the exclusive control section of the example program of FIG. 10 to a minimum. The example program shown in FIG. 11 does this.

Line 5, as in FIG. 10, declares the random number generation class RandomNumberGenerator shared by all threads. The class contains a member variable corresponding to the state variable, so that the state variable information is shared by all threads.

Line 6 defines the variable NRand, the number of random numbers used in one trial of the Monte Carlo method. In this example NRand = 2, because each trial requires two random numbers in total: one for the X coordinate value and one for the Y coordinate value.

Lines 28 and 34 control the exclusive lock on the state variable generation storage unit 481. As a result, while one thread is obtaining the value of the state variable (line 31) or skipping the state variable (line 33), no other thread can obtain the value of the state variable or perform the skip process.

Line 37 sets the previously obtained state variable in the random number generation class of each thread. Lines 38 and 39 then generate random numbers, which become the X and Y coordinate values.
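The FIG. 11 approach, in which only the state hand-off and the skip are performed under the lock and the random numbers are generated outside it, can be illustrated as follows. This is a sketch under assumptions rather than the program of FIG. 11: std::mt19937 stands in for RandomNumberGenerator, its discard member plays the role of the skip process, and the names AreaOfCircleSkip, local_gen, and rng_state are hypothetical.

```cpp
#include <cassert>
#include <random>

// Parallel estimate in which the lock covers only "copy state, then skip".
double AreaOfCircleSkip(long N_Sim, unsigned int seed) {
    assert(N_Sim > 0);
    const unsigned long long NRand = 2;  // random numbers used per trial
    std::mt19937 shared_gen(seed);       // shared state (role of unit 481)
    std::mt19937 local_gen(seed);        // each thread gets its own copy
    long s = 0;

    #pragma omp parallel for firstprivate(local_gen) reduction(+ : s)
    for (long SimNo = 0; SimNo < N_Sim; ++SimNo) {
        // Exclusive lock: obtain the current state, then advance the shared
        // state past the NRand numbers that this trial will consume.
        #pragma omp critical(rng_state)
        {
            local_gen = shared_gen;     // obtain the value of the state
            shared_gen.discard(NRand);  // skip process
        }

        // Random number generation and the trial run outside the lock.
        double x = (static_cast<double>(local_gen()) + 0.5) / 4294967296.0;
        double y = (static_cast<double>(local_gen()) + 0.5) / 4294967296.0;
        double r_sq = (x - 0.5) * (x - 0.5) + (y - 0.5) * (y - 0.5);
        if (r_sq < 0.25) {
            ++s;
        }
    }
    return static_cast<double>(s) / static_cast<double>(N_Sim);
}
```

Copying the whole Mersenne Twister state for every trial is costly, so with only two random numbers per trial this sketch is unlikely to be faster than drawing under the lock. The hand-off is more attractive when each trial consumes many random numbers, which is the situation the text describes.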

FIG. 12 is an example program that declares the random number generation class "RandomNumberGenerator" used in the example programs of FIGS. 9, 10, and 11. Line 10 of FIG. 12 declares the member variable state_vec, which corresponds to the state variable. By storing the value of the state variable in this variable, the class can serve as the state variable generation storage unit 481. The class RandomNumberGenerator also provides a method that generates random numbers (line 23), a method that skips the state variable (line 26), and a method that obtains the value of the state variable (line 29). In terms of the block diagram of FIG. 4, these correspond to the random number generation units (reference numerals 445, 455, 465, 475), the state variable skip processing unit (482), and the state variable acquisition units (443, 453, 463, 473), respectively. The actual processing of the class depends on whether the Mersenne Twister, a linear congruential generator, or some other method is used as the random number generation algorithm.
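A class with these three roles could be sketched as below for the linear congruential case. The member name state_vec comes from the description of FIG. 12. Everything else is an assumption made for illustration: the method names Generate, Skip, GetState, and SetState, the single 64-bit state, and the multiplier and increment (Knuth's MMIX constants). The function DemoHandOff is a hypothetical check that a copy of the state taken before a skip reproduces exactly the numbers that were skipped.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of a generator exposing generation, skip, and state access.
class RandomNumberGenerator {
public:
    explicit RandomNumberGenerator(std::uint64_t seed = 1) : state_vec(seed) {}

    // Advance the state once and convert it to a number in (0, 1).
    double Generate() {
        Skip(1);
        // Use the upper 32 bits of the state; divide by 2^32.
        return (static_cast<double>(state_vec >> 32) + 0.5) / 4294967296.0;
    }

    // Skip process: advance the state n steps without producing output.
    void Skip(std::uint64_t n) {
        for (std::uint64_t i = 0; i < n; ++i) {
            // Unsigned arithmetic wraps modulo 2^64.
            state_vec = 6364136223846793005ULL * state_vec
                        + 1442695040888963407ULL;
        }
    }

    // State acquisition and hand-off to another generator object.
    std::uint64_t GetState() const { return state_vec; }
    void SetState(std::uint64_t state) { state_vec = state; }

private:
    std::uint64_t state_vec;  // the state variable
};

// A thread-local copy taken before Skip(2) yields the two skipped numbers.
void DemoHandOff() {
    RandomNumberGenerator shared(12345);
    RandomNumberGenerator local;
    local.SetState(shared.GetState());  // obtain the state
    shared.Skip(2);                     // reserve two numbers for `local`

    RandomNumberGenerator reference(12345);  // plain sequential generator
    assert(local.Generate() == reference.Generate());
    assert(local.Generate() == reference.Generate());
    assert(shared.GetState() == reference.GetState());
}
```

This simple Skip costs time proportional to n. The sketch says nothing about how a skip would be carried out for the Mersenne Twister or for other generators.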

Parallel computers can be broadly divided into shared-memory parallel computers and distributed-memory parallel computers. A shared-memory parallel computer is one in which all processor cores use memory jointly. This memory is called shared memory. A shared-memory parallel computer has the advantage that data is easily exchanged between processor cores. Its disadvantage is that, when the number of processor cores is large, writes to the shared memory contend with one another and performance may drop. OpenMP can be used for implementation on a shared-memory parallel computer.

A distributed-memory parallel computer can execute each of the first to fourth threads 440, 450, 460, and 470 shown in FIG. 4 on separate processors, each with its own memory. To exchange data, the processors must communicate with one another. MPI (Message Passing Interface) can be used for implementation on a distributed-memory parallel computer.

The above embodiment can be implemented on either a shared-memory or a distributed-memory parallel computer. That is, as long as all of the first to fourth threads 440, 450, 460, and 470 shown in FIG. 4 can share the state variable information, the embodiment can be implemented on either type. Each thread may be assigned to a single processor core, or each thread may itself be processed by two or more processor cores.

Note that the hardware configuration of the computer for implementing each embodiment of the present invention is not limited to the illustrated configuration. It may include one or more dedicated numerical arithmetic units suited to the form of the numerical computation, or it may be a cluster divided among multiple housings connected to one another by a network. Not only in this example but throughout the present invention, functional means that are described as distinct are realized by any component that substantially performs the distinguished function. The present invention is not limited by attributes that do not constrain the performance of the function, such as how many such components physically exist or, where there are several, how they are positioned relative to one another. For example, the scope of the present invention includes the case where several distinct functions are performed by a single component at different times.

The software configuration for numerical computation in a computer that implements the functional processing of each embodiment of the present invention may be any configuration that realizes the numerical information processing of that embodiment. The computer carries software for hardware control, such as a basic input/output system (BIOS), and is managed by an operating system (OS) that works together with that software and handles file input/output and the allocation of hardware resources. The OS can execute application programs that operate in cooperation with the OS and the hardware, for example in response to an explicit instruction from a user, an indirect instruction from a user, or an instruction from another program. An application program is written so as to enable such operation and to work in association with the OS, either relying on the procedures defined by the OS or independently of the OS. Each embodiment is generally implemented as a dedicated application program that performs processing such as numerical computation and file input/output, but the present invention is not limited to this. It may be realized by using several dedicated or general-purpose application programs, by partially using an existing numerical library, by network programming techniques so that processing is carried out by the hardware of another computer, or by any other form of implementation. Accordingly, software expressing a series of instructions for implementing the calculation method of each embodiment on a computer is simply called a program. A program is expressed in any form executable by a computer or in any form that can ultimately be converted into such a form.

The program of each embodiment of the present invention is configured so that computing means such as an MPU, which is a hardware resource, receives instructions from the calculation program, either through the OS or without going through it, and performs arithmetic processing over an appropriate bus or the like, which is a hardware resource, in cooperation with storage means such as main memory and auxiliary storage devices, which are also hardware resources. In other words, the information processing by the software that realizes the calculation method of each embodiment is implemented so as to be realized by these hardware resources. Storage means or a storage unit refers to part or all of a computer-readable information storage medium that is logically partitioned in arbitrary units, or a combination of such parts. The storage means is realized by any hardware resource, for example a cache memory inside the MPU, main memory connected to the MPU, or a non-volatile storage medium such as a hard disk drive connected to the MPU by an appropriate bus. Here, the storage means may take any form, such as an area in memory defined by the architecture of the MPU, a file or folder on a file system managed by the OS, a list or record in a database management system accessible on the same computer or on any computer on the network, or a record managed in several lists related to one another by a relational database. It includes anything that is logically distinguished from other storage and can store or record information at least temporarily in an identifiable manner.

400 Parallel processing system
410 Input device
420 Display device
430 Parallel processing control unit
440 First thread
450 Second thread
460 Third thread
470 Fourth thread
441, 451, 461, 471 Trial count unit
442, 452, 462, 472 Exclusive lock control unit
443, 453, 463, 473 State variable acquisition unit
445, 455, 465, 475 Random number generation unit
446, 456, 466, 476 Trial execution unit
447, 457, 467, 477 Local storage unit
480 Global storage unit
481 State variable generation storage unit
482 State variable skip processing unit
490 Result output unit

Claims

A parallel processing method in which execution of a Monte Carlo method by a computer is processed in parallel using a state variable generation storage unit and a plurality of threads, the method comprising:
a step of storing a state variable in the state variable generation storage unit;
a random number generation step in which a processor core of one thread, in a state where only the processor core of the one thread can access the state variable generation storage unit under exclusive control, generates random numbers by sequentially performing a skip process and a conversion process a predetermined number of times on the state variable stored in the state variable generation storage unit; and
a trial calculation step in which the processor core of the one thread performs the calculation of the trial assigned to the one thread using the generated random numbers,
wherein the random number generation step and the trial calculation step are also executed sequentially, in parallel, in the other threads.

The parallel processing method according to claim 1, wherein the random number generation step includes:
a step in which the processor core of the one thread determines whether an exclusive lock is applied to the state variable generation storage unit in which the state variable used for generating random numbers is stored;
a step in which, if the state variable generation storage unit is not exclusively locked, the processor core of the one thread applies an exclusive lock to the state variable generation storage unit;
a processing step in which the processor core of the one thread sequentially performs a skip process and a conversion process a predetermined number of times on the state variable stored in the state variable generation storage unit; and
a step in which the processor core of the one thread releases the exclusive lock on the state variable generation storage unit.

A parallel processing method in which execution of a Monte Carlo method by a computer is processed in parallel using a state variable generation storage unit and a plurality of threads, the method comprising:
a step of storing a state variable in the state variable generation storage unit;
a state variable skip processing step in which a processor core of one thread, in a state where only the processor core of the one thread can access the state variable generation storage unit under exclusive control, obtains the state variable stored in the state variable generation storage unit and sequentially performs a skip process a predetermined number of times on that state variable;
a step in which the processor core of the one thread generates random numbers by performing a conversion process and a skip process a predetermined number of times on the obtained state variable; and
a trial calculation step in which the processor core of the one thread performs the calculation of the trial assigned to the one thread using the generated random numbers,
wherein each of the steps is also executed sequentially, in parallel, in the other threads.

The parallel processing method according to claim 3, wherein the state variable skip processing step includes:
a step in which the processor core of the one thread determines whether an exclusive lock is applied to the state variable generation storage unit in which the state variable used for generating random numbers is stored;
a step in which, if the state variable generation storage unit is not exclusively locked, the processor core of the one thread applies an exclusive lock to the state variable generation storage unit;
a step in which the processor core of the one thread obtains the value of the state variable from the state variable generation storage unit;
a processing step in which the processor core of the one thread sequentially performs a skip process a predetermined number of times on the state variable stored in the state variable generation storage unit; and
a step in which the processor core of the one thread releases the exclusive lock on the state variable generation storage unit.

A method for measuring a credit risk amount of a portfolio including a plurality of bonds, wherein the credit risk amount is measured by executing a Monte Carlo method using the method according to claim 1.

A method for measuring an amount of market risk caused by a change in a financial market by executing a Monte Carlo method using the method according to claim 1.

A method for calculating a price of a financial product by modeling a change in a financial market and executing a Monte Carlo method using the method according to claim 1.

A parallel processing system that executes a Monte Carlo method by a computer by parallel processing using a state variable generation storage unit and a plurality of threads, the system comprising:
a state variable generation storage unit that stores a state variable used for generating random numbers and is shared by the plurality of threads;
a random number generation unit that generates random numbers by sequentially performing a conversion process and a skip process a predetermined number of times on the state variable stored in the state variable generation storage unit;
an exclusive lock control unit that, for each thread, applies an exclusive lock to the state variable generation storage unit or releases the exclusive lock; and
an arithmetic processing execution unit that performs a trial calculation using the random numbers generated by the random number generation unit.

A parallel processing system that executes a Monte Carlo method by a computer by parallel processing using a state variable generation storage unit and a plurality of threads, the system comprising:
a state variable generation storage unit that stores a state variable used for generating random numbers and is shared by the plurality of threads;
a state variable skip processing unit that sequentially performs a skip process a predetermined number of times on the state variable stored in the state variable generation storage unit;
an exclusive lock control unit that, for each thread, applies an exclusive lock to the state variable generation storage unit or releases the exclusive lock;
a state variable acquisition unit that obtains the value of the state variable from the state variable generation storage unit and stores the value in a memory;
a random number generation unit that generates random numbers by performing a conversion process and a skip process a predetermined number of times on the state variable stored in the memory; and
an arithmetic processing execution unit that performs a trial calculation using the random numbers generated by the random number generation unit.

The parallel processing system according to claim 8 or 9, which measures a credit risk amount of a portfolio including a plurality of bonds.

The parallel processing system according to claim 8 or 9, which measures the amount of market risk caused by fluctuations in the financial market.

The parallel processing system according to claim 8 or 9, which calculates the price of a financial product by modeling changes in the financial market and executing the Monte Carlo method.

A program for causing a computer to execute each step of the method according to any one of claims 1 to 7.