CN105843588A - MIC based random number generator segmented parallelizing method - Google Patents
- Publication number
- CN105843588A (application number CN201610150661.1A)
- Authority
- CN
- China
- Prior art keywords
- num
- thread
- random number
- threads
- mic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/58—Random or pseudo-random number generators
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Complex Calculations (AREA)
Abstract
The present invention discloses a MIC-based segmented parallelization method for random number generators. The periodic sequence is divided into segments that are generated in parallel, and the random numbers produced by the threads are spliced together to form the final sequence. Relative to a single CPU thread, the speedup on the MIC platform shows a significant advantage.
Description
Technical field
The present invention relates to a segmented parallelization method for general-purpose random number generators, and in particular to a method of outputting random number sequences on hardware based on the Intel MIC architecture.
Background art
A random number generator is a device for producing random numbers. Generators are generally divided into true random number generators and pseudo-random number generators. Scientific research and engineering simulation place ever-growing demands on the performance and speed of random number generators, and applying parallel computing techniques to a random number generator can quickly improve its generation efficiency. However, research on parallelizing random number generators has mainly concentrated on multi-core central processing unit (CPU) platforms; theoretical foundations and performance evaluations for parallelization on Intel's more recent MIC (Many Integrated Core) platform are lacking.
The Intel MIC architecture has smaller cores, more hardware threads, and wider vector units, which makes it well suited to improving overall performance for highly parallel applications. It is based on the x86 architecture and supports parallel programming models such as OpenMP and Pthreads. A Xeon Phi accelerator card based on the MIC architecture consists of 57 to 61 physical cores, each with 4 hardware threads; the on-board memory is 6 GB to 8 GB, and peak double-precision performance reaches 1 TFlops. It has an advantage over a CPU in parallel computation and is suited to highly parallel computing problems.
Summary of the invention
Problem to be solved by the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing a MIC-based segmented parallelization method for random number generators.
Solution to the problem
A MIC-based segmented parallelization method for random number generators, comprising the following steps:
Step A, obtaining the partition parameters, including
Step A1, obtaining the period ρ of the original random number sequence;
Step A2, obtaining the maximum supported number of threads N and the number of random numbers each thread can produce, L = ρ/N;
Step B, the main thread reads the parameters and computes the initial state seed[id] of each thread, including
Step B1, the main thread reads the required quantity of random numbers random_num, the number of allocated threads threads_num, and the seed seed, where 0 < random_num < ρ and 0 < threads_num < N;
Step B2, allocating the memory region result;
Step B3, computing the number of random numbers each thread must produce, num = random_num/threads_num;
Step B4, setting id to 0;
Step B5, computing seed[id];
Step B6, if id ≥ threads_num − 1, going to Step C;
Step B7, incrementing id by 1 and returning to Step B5;
Step C, MIC thread computation, including
Step C1, setting the initial value of i to 0 and the initial value of U to 0;
Step C2, computing U with the serial random number algorithm;
Step C3, result[id*num+i] = U;
Step C4, if i ≥ num − 1, terminating this thread;
Step C5, incrementing i by 1 and returning to Step C2.
Effect of the invention
The present invention parallelizes generation by segmenting the periodic sequence, and finally splices the random numbers generated by the threads to form the final sequence. Relative to a single CPU thread, the speedup on the MIC platform shows a clear advantage.
Brief description of the drawings
Fig. 1 is a schematic diagram of the segmented parallelization of a random number generator;
Fig. 2 is a flow chart of the computation that parallelizes a random number generator on MIC;
Fig. 3a and Fig. 3b are speedup trend charts for MRG32k3a after parallelization on CPU and MIC.
Detailed description
Various exemplary embodiments, features, and aspects of the present invention are described in detail below with reference to embodiments. Numerous specific details are given in the detailed description below in order to better explain the invention. Those skilled in the art will understand that the invention can equally be practiced without these details. In other instances, well-known methods, means, and materials are not described in detail, so as to highlight the gist of the invention.
As shown in Fig. 1, the present invention parallelizes generation by segmenting the periodic sequence. First, the original random number sequence of period ρ is divided equally into N segments (N being the maximum supported number of threads), each containing ρ/N numbers; each thread then generates random numbers recursively from the starting point of its own segment. Let random_num be the total quantity of random numbers to generate and threads_num the number of allocated threads; each thread then produces random_num/threads_num random numbers. Finally, the random numbers generated by the threads are spliced together to form the final sequence.
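The segment layout described above can be written out explicitly. The notation here is introduced only for illustration: $u_n$ denotes the $n$-th output of the serial generator, $L = \rho/N$ the segment length, and the zero-based thread index $\mathrm{id}$ is an assumption chosen to match the storage rule result[id*num+i].

```latex
\text{thread } \mathrm{id} \text{ generates } u_{\mathrm{id}\cdot L},\; u_{\mathrm{id}\cdot L+1},\; \dots,\; u_{\mathrm{id}\cdot L+\mathrm{num}-1},
\qquad
\mathrm{result}[\mathrm{id}\cdot \mathrm{num} + i] = u_{\mathrm{id}\cdot L + i}, \quad 0 \le i < \mathrm{num}.
```

Under this reading, the pieces produced by different threads do not overlap as long as $\mathrm{num} \le L$.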
In one possible embodiment, the present invention comprises the following steps:
1. Preparation
a) Study the serial algorithm of the random number generator and obtain its period ρ;
b) Determine the period partition rule: the supported number of threads N and the number of random numbers each thread can produce, L = ρ/N.
2. Main thread: read the relevant parameters and compute the initial state seed[id] of each thread
a) Read the required quantity of random numbers random_num, the number of allocated threads threads_num, and the seed seed, where 0 < random_num < ρ and 0 < threads_num < N;
b) Allocate the memory region result;
c) Compute the number of random numbers each thread must produce, num = random_num/threads_num;
d) id = 0;
e) Compute seed[id];
f) If id ≥ threads_num − 1, go to 3;
g) id++, go to e).
3. MIC threads: the MIC device is driven in native mode, offload mode, or similar; the threads work concurrently and each performs the same computation below, as shown in Fig. 2:
a) i = 0, U = 0;
b) Compute U with the serial random number algorithm;
c) result[id*num+i] = U;
d) If i ≥ num − 1, terminate;
e) i++, go to b).
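The three stages above can be sketched in C++ with a deliberately small stand-in generator, so that the result can be checked against the serial sequence by brute force. Everything named here is hypothetical and not taken from the patent: the toy generator x_{n+1} = 75·x_n mod 65537 (period 65536), the helpers `next_state` and `pow_mod`, and the function `segmented_generate`. Thread ids are assumed to be zero-based, and the OpenMP pragma is only enabled when the compiler defines `_OPENMP`.

```cpp
#include <cstdint>
#include <vector>

// Toy generator for illustration only (not the patent's generator):
// x_{n+1} = 75 * x_n mod 65537, with period rho = 65536 for seeds in [1, 65536].
constexpr std::uint64_t kMod = 65537;
constexpr std::uint64_t kMul = 75;
constexpr std::uint64_t kPeriod = 65536;

// One step of the serial algorithm.
inline std::uint64_t next_state(std::uint64_t x) { return (kMul * x) % kMod; }

// base^exp mod kMod by repeated squaring; lets a thread jump ahead exp steps.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp) {
    std::uint64_t result = 1;
    base %= kMod;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % kMod;
        base = (base * base) % kMod;
        exp >>= 1;
    }
    return result;
}

// Segmented generation: thread `id` starts at the beginning of segment `id`
// and writes its num outputs to result[id*num + i].
std::vector<std::uint64_t> segmented_generate(std::uint64_t seed,
                                              std::uint64_t max_threads,  // N
                                              std::uint64_t threads_num,
                                              std::uint64_t random_num) {
    const std::uint64_t L = kPeriod / max_threads;         // stage 1: segment length
    const std::uint64_t num = random_num / threads_num;    // stage 2c
    std::vector<std::uint64_t> result(num * threads_num);  // stage 2b

    // Stage 2d-2g: initial state of each thread, seed[id] = a^(id*L) * seed mod m.
    std::vector<std::uint64_t> seeds(threads_num);
    for (std::uint64_t id = 0; id < threads_num; ++id)
        seeds[id] = (pow_mod(kMul, id * L) * seed) % kMod;

    // Stage 3: every thread runs the same serial recurrence on its own segment.
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (long long id = 0; id < static_cast<long long>(threads_num); ++id) {
        const auto tid = static_cast<std::uint64_t>(id);
        std::uint64_t x = seeds[tid];
        for (std::uint64_t i = 0; i < num; ++i) {
            x = next_state(x);          // compute U with the serial algorithm
            result[tid * num + i] = x;  // result[id*num+i] = U
        }
    }
    return result;
}
```

For example, `segmented_generate(1, 8, 4, 4000)` fills a 4000-element buffer in which the block belonging to thread id holds the 1000 serial outputs that begin at offset id·8192 of the toy sequence. The per-thread buffer slices are disjoint, so the threads need no locking.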
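In one possible embodiment, the steps of the present invention are as follows.
The MIC device is driven in native mode, and the segmented parallelization of the random number generator MRG32k3a, implemented in C++, is taken as an example. The MRG32k3a recurrences are:
$$x_{1,n} = (a_{1,2}\,x_{1,n-2} + a_{1,3}\,x_{1,n-3}) \bmod m_1$$
$$x_{2,n} = (a_{2,1}\,x_{2,n-1} + a_{2,3}\,x_{2,n-3}) \bmod m_2$$
where $n \ge 3$,
$$a_{1,2} = 1403580, \quad a_{1,3} = -810728$$
$$a_{2,1} = 527612, \quad a_{2,3} = -1370589$$
$$m_1 = 2^{32} - 209, \quad m_2 = 2^{32} - 22853$$
A uniform random number $U_n$ in $[0,1)$ is produced as
$$z_n = (x_{1,n} - x_{2,n}) \bmod m_1$$
$$U_n = \begin{cases} z_n/(m_1+1), & z_n > 0 \\ m_1/(m_1+1), & z_n = 0 \end{cases}$$
Its period is $\rho \approx 2^{191}$. At most $N = 2^{64}$ threads are run, and each thread produces at most $L = 2^{127}$ random numbers, so that $N \cdot L = 2^{191}$.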
The MRG32k3a recurrences are converted into vector form:
$$X_{i,n+1} = A_i X_{i,n} \bmod m_i, \quad i = 1, 2$$
Before the threads generate random numbers, the initial value of each thread must be computed. If the initial value of thread 0 is $s_0$, the initial value $s_1$ of thread 1 is computed as follows.
From the identity $(xyz) \bmod p = (((xy) \bmod p)\,z) \bmod p$
it follows that
$$s_1 = (A^L s_0) \bmod m = (((A^L) \bmod m) \cdot s_0) \bmod m$$
where $A^L \bmod m$ can be computed by a divide-and-conquer algorithm with the recurrence
$$A^L \bmod m = ((A^{L/2} \bmod m) \cdot (A^{L/2} \bmod m)) \bmod m$$
The initial state $s_{id}$ of every thread is computed in the same way, where $0 \le id < \mathrm{threads\_num}$.
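The jump-ahead computation can be sketched in C++ as follows. The text does not spell out the matrices $A_i$; the companion matrices `kA1` and `kA2` below are an assumption derived from the scalar recurrences, with each component state ordered as (x_{n-2}, x_{n-1}, x_n) and negative coefficients stored as m − |a|. The helper names (`mat_mul_mod`, `mat_pow_two_mod`, `mat_pow_mod`, `thread_state`) are likewise hypothetical. Because L = 2^127 does not fit in a 64-bit integer, A^L mod m is obtained by 127 successive squarings, which is the divide-and-conquer recurrence applied to a power of two.

```cpp
#include <array>
#include <cstdint>

using Vec3 = std::array<std::uint64_t, 3>;
using Mat3 = std::array<Vec3, 3>;

constexpr std::uint64_t kMod1 = 4294967087ULL;  // m1 = 2^32 - 209
constexpr std::uint64_t kMod2 = 4294944443ULL;  // m2 = 2^32 - 22853

// Assumed companion matrices of the two recurrences.
const Mat3 kA1 = {{{0, 1, 0}, {0, 0, 1}, {kMod1 - 810728, 1403580, 0}}};
const Mat3 kA2 = {{{0, 1, 0}, {0, 0, 1}, {kMod2 - 1370589, 0, 527612}}};

// (a * b) mod m for 3x3 matrices with entries in [0, m), m < 2^32.
Mat3 mat_mul_mod(const Mat3& a, const Mat3& b, std::uint64_t m) {
    Mat3 c{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                c[i][j] = (c[i][j] + (a[i][k] * b[k][j]) % m) % m;
    return c;
}

// (a * v) mod m.
Vec3 mat_vec_mod(const Mat3& a, const Vec3& v, std::uint64_t m) {
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        for (int k = 0; k < 3; ++k)
            r[i] = (r[i] + (a[i][k] * v[k]) % m) % m;
    return r;
}

// a^(2^e) mod m by e successive squarings.
Mat3 mat_pow_two_mod(Mat3 a, unsigned e, std::uint64_t m) {
    for (unsigned k = 0; k < e; ++k) a = mat_mul_mod(a, a, m);
    return a;
}

// a^n mod m for a 64-bit exponent n (square and multiply).
Mat3 mat_pow_mod(Mat3 a, std::uint64_t n, std::uint64_t m) {
    Mat3 r = {{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}};
    while (n > 0) {
        if (n & 1) r = mat_mul_mod(r, a, m);
        a = mat_mul_mod(a, a, m);
        n >>= 1;
    }
    return r;
}

// Initial state of thread `id` for one component: s_id = (A^(id*L) s_0) mod m,
// with L = 2^127 as stated in the text.
Vec3 thread_state(const Mat3& a, std::uint64_t m, const Vec3& s0, std::uint64_t id) {
    const Mat3 jump = mat_pow_two_mod(a, 127, m);  // A^L mod m
    return mat_vec_mod(mat_pow_mod(jump, id, m), s0, m);
}
```

A full thread seed would combine `thread_state(kA1, kMod1, ...)` for the first three seed values with `thread_state(kA2, kMod2, ...)` for the last three. Each product of two entries stays below 2^64 and is reduced before it is accumulated, so unsigned 64-bit arithmetic does not overflow.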
In summary, the segmented parallel algorithm for MRG32k3a is as follows:
Algorithm: MRG32k3a segmented parallel algorithm
Input: seed seed[6], number of threads threads_num,
task size random_num
Output: random_num random numbers U
Begin
(1) main thread
(2) MIC device, native mode
Based on the above embodiment, the following was tested: the time MRG32k3a takes to produce 1,000,000, 10,000,000, 100,000,000, 1,000,000,000, and 10,000,000,000 random numbers with 1, 2, 4, 8, and 16 threads on the CPU platform, and with 1, 2, 4, 8, 16, 32, 56, 112, 168, and 224 threads on the MIC platform. Relative to a single CPU thread, the best speedup on the MIC platform is 17.73.
The test results are shown in Fig. 3a and Fig. 3b.
The hardware environment is as follows:
The present invention parallelizes generation by segmenting the periodic sequence and splices the random numbers generated by the threads to form the final sequence. Relative to a single CPU thread, the speedup on the MIC platform shows a clear advantage.
Although the present invention has been described with reference to the above embodiments, it should be understood that the invention is not limited to the disclosed embodiments. The scope of the appended claims should be given the broadest interpretation so as to cover all modifications, equivalent structures, and functions.
Claims (1)
1. A MIC-based segmented parallelization method for random number generators, characterized by comprising the following steps:
Step A, obtaining the partition parameters, including
Step A1, obtaining the period ρ of the original random number sequence;
Step A2, obtaining the maximum supported number of threads N and the number of random numbers each thread can produce, L = ρ/N;
Step B, the main thread reads the parameters and computes the initial state seed[id] of each thread, including
Step B1, the main thread reads the required quantity of random numbers random_num, the number of allocated threads threads_num, and the seed seed, where 0 < random_num < ρ and 0 < threads_num < N;
Step B2, allocating the memory region result;
Step B3, computing the number of random numbers each thread must produce, num = random_num/threads_num;
Step B4, setting id to 0;
Step B5, computing seed[id];
Step B6, if id ≥ threads_num − 1, going to Step C;
Step B7, incrementing id by 1 and returning to Step B5;
Step C, MIC thread computation, including
Step C1, setting the initial value of i to 0 and the initial value of U to 0;
Step C2, computing U with the serial random number algorithm;
Step C3, result[id*num+i] = U;
Step C4, if i ≥ num − 1, terminating this thread;
Step C5, incrementing i by 1 and returning to Step C2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610150661.1A CN105843588A (en) | 2016-03-16 | 2016-03-16 | MIC based random number generator segmented parallelizing method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105843588A true CN105843588A (en) | 2016-08-10 |
Family
ID=56588013
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610150661.1A Pending CN105843588A (en) | 2016-03-16 | 2016-03-16 | MIC based random number generator segmented parallelizing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105843588A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120105462A1 (en) * | 2010-10-28 | 2012-05-03 | Mizuho-Dl Financial Technology Co., Ltd. | Parallelization of random number generation processing by employing gpu |
CN103475469A (en) * | 2013-09-10 | 2013-12-25 | 中国科学院数据与通信保护研究教育中心 | Method and device for achieving SM2 algorithm with combination of CPU and GPU |
Non-Patent Citations (4)
Title |
---|
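Song Bowen et al.: "Design and implementation of MIC-based MRG32k3a parallelization", Computer Applications and Software * |
Zhang Baodong et al.: "Parallel implementation and performance analysis of Knuth39 on a many-core platform", Journal of Computer Applications * |
Li Zhijie et al.: "Design and implementation of MIC-based CLCG4 parallelization", Electronic Science and Technology * |
Cai Xiaolong: "Design and implementation of MIC-based RAN2 parallelization", Journal of Chifeng University * |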
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106648543A (en) * | 2016-12-29 | 2017-05-10 | 北京握奇智能科技有限公司 | Random number generation method and device |
CN106648543B (en) * | 2016-12-29 | 2019-09-27 | 北京握奇智能科技有限公司 | A kind of random digit generation method and device |
CN108984152A (en) * | 2018-08-21 | 2018-12-11 | 北京睦合达信息技术股份有限公司 | A kind of data processing method, system and computer readable storage medium |
CN108984152B (en) * | 2018-08-21 | 2021-01-29 | 北京睦合达信息技术股份有限公司 | Data processing method, system and computer readable storage medium |
CN109521997A (en) * | 2018-11-16 | 2019-03-26 | 中国人民解放军战略支援部队信息工程大学 | The random digit generation method and device executed for shared storage multi-threaded parallel |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20160810 |