CN106020949B - Fast parallel calculation method for optimal extension field element multiplication - Google Patents
- Publication number
- CN106020949B (application CN201610305021.3A)
- Authority
- CN
- China
- Prior art keywords
- multiplication
- renderscript
- main body
- field element
- body function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/542—Event management; Broadcasting; Multicasting; Notifications
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
The present invention provides a fast parallel calculation method for optimal extension field element multiplication, comprising: a first step of designing a dedicated optimal extension field element operator Java class for the multiplication of optimal extension field elements; and a second step of designing, in RenderScript, a multiplication kernel function and a degree-reduction kernel function. Both functions are compute kernels that the Java class object invokes concurrently through the RenderScript execution engine, and each function is defined to operate on a single memory unit or on the first address of a batch of homogeneous memory units. The method uses the RenderScript programming interface and parallel processing mechanism on the Android platform to implement fast parallel polynomial modular multiplication.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a multiplication calculation method for optimal extension fields based on the RenderScript programming framework.
Background art
A finite field $\mathbb{F}_{p^m}$ is called an optimal extension field (OEF) if $p = 2^n - c$, where $p$ is a prime within the range $2^{64}$ and the integers $n$ and $c$ satisfy $\log_2 |c| \le n/2$, and if there exists an irreducible polynomial $f(z) = z^m - \omega$. The elements of this field are polynomials of degree at most $m-1$ whose coefficients are elements of $\mathbb{F}_p$. The basic operations on OEF elements (addition, subtraction, multiplication, squaring, inversion) are polynomial operations modulo $f(z)$, and the coefficient operations are defined in $\mathbb{F}_p$, that is, they are arithmetic modulo $p$.
When $c$ is 1 or -1, the optimal extension field is said to be of Type I; when $\omega = 2$, it is said to be of Type II. Table 1 lists some commonly used optimal extension field parameters.
Table 1. Examples of OEF parameters
p | f | parameters | Type |
$2^{7}+3$ | $z^{13}-5$ | n = 7, c = -3, m = 13, ω = 5 | - |
$2^{13}-1$ | $z^{13}-2$ | n = 13, c = 1, m = 13, ω = 2 | I, II |
$2^{57}-13$ | $z^{3}-2$ | n = 57, c = 13, m = 3, ω = 2 | II |
$2^{31}-19$ | $z^{6}-2$ | n = 31, c = 19, m = 6, ω = 2 | II |
$2^{61}-1$ | $z^{3}-37$ | n = 61, c = 1, m = 3, ω = 37 | I |
$2^{31}-1$ | $z^{6}-7$ | n = 31, c = 1, m = 6, ω = 7 | I |
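The parameter conditions can be checked mechanically. The following Java sketch is illustrative only and is not part of the patent: the class name `OefParams` and all of its methods are hypothetical. It computes $p = 2^n - c$, tests the size bound $\log_2 |c| \le n/2$ (rewritten as $c^2 \le 2^n$), and reports the type. It does not test whether $z^m - \omega$ is irreducible.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of the patent: describes p = 2^n - c and f(z) = z^m - omega.
final class OefParams {
    final int n;
    final long c;
    final int m;
    final long omega;

    OefParams(int n, long c, int m, long omega) {
        this.n = n;
        this.c = c;
        this.m = m;
        this.omega = omega;
    }

    // p = 2^n - c, computed with BigInteger so that n close to 64 cannot overflow.
    BigInteger p() {
        return BigInteger.ONE.shiftLeft(n).subtract(BigInteger.valueOf(c));
    }

    // log2|c| <= n/2 is equivalent to c^2 <= 2^n for c != 0.
    boolean satisfiesSizeBound() {
        BigInteger absC = BigInteger.valueOf(c).abs();
        return absC.signum() != 0
                && absC.multiply(absC).compareTo(BigInteger.ONE.shiftLeft(n)) <= 0;
    }

    // Probabilistic primality test on p; irreducibility of f(z) is NOT checked here.
    boolean hasPrimeModulus() {
        return p().isProbablePrime(40);
    }

    // Type I when c = +1 or -1, Type II when omega = 2; both may hold at once.
    String type() {
        boolean typeOne = Math.abs(c) == 1;
        boolean typeTwo = omega == 2;
        if (typeOne && typeTwo) return "I, II";
        if (typeOne) return "I";
        if (typeTwo) return "II";
        return "-";
    }
}
```

For the first row of Table 1, `new OefParams(7, -3, 13, 5)` gives `p()` equal to 131 and `type()` equal to "-"; for the second row, `new OefParams(13, 1, 13, 2).type()` gives "I, II".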
Optimal extension fields can be used to construct elliptic curves. Their advantage is that the prime $p$ can be chosen to fit within a single machine word, while the security of the cryptosystem is raised by the combined effect of $m$ and $p$. Computing the polynomial coefficients then simplifies or avoids problems such as carries in big-integer arithmetic, which improves the efficiency of software implementations.
Among the basic operations on OEF elements, polynomial multiplication has relatively high complexity, and multiplication is in turn the basis of operations such as inversion.
In the book "Guide to Elliptic Curve Cryptography", Darrel Hankerson et al. describe the principle of OEF element multiplication. The product of OEF elements $a$ and $b$ can be computed as an ordinary polynomial multiplication with coefficients modulo $p$, followed by reduction modulo the irreducible polynomial $f(z)$. The formula is:
$$c(z) = a(z)\,b(z) = \sum_{k=0}^{2m-2} c_k z^k \equiv c_{m-1} z^{m-1} + \sum_{k=0}^{m-2} (c_k + \omega c_{k+m})\, z^k \pmod{f(z)}$$
where
$$c_k = \sum_{i+j=k} a_i b_j \bmod p.$$
This description gives only the principle. It does not address implementation details and does not consider problems such as overflow and carry during coefficient multiplication and modular reduction.
The Karatsuba-Ofman method uses a divide-and-conquer approach to reduce the number of coefficient multiplications. Each of the polynomials $a$ and $b$ is split into a sum of two polynomials, and the distributive law of multiplication is used to reduce the multiplication count. For example:
$$a(z)\,b(z) = (A_1 z^l + A_0)(B_1 z^l + B_0) = A_1 B_1 z^{2l} + [(A_1 + A_0)(B_1 + B_0) - A_1 B_1 - A_0 B_0]\, z^l + A_0 B_0$$
where $A_0$, $A_1$, $B_0$, $B_1$ are all polynomials of degree no more than $l$. The process can be applied recursively, that is, $A_0$, $A_1$, $B_0$, $B_1$ are decomposed again. In the Karatsuba-Ofman method the polynomials are decomposed repeatedly, level by level, and each computation depends on the results of the subsequent decompositions. The data of the individual computations are therefore not independent, so the method cannot be parallelized and cannot exploit multi-core parallel computing. In addition, the polynomial splitting procedure is complicated to implement, the splitting method directly affects execution efficiency, and for small values of $m$ the overhead is large.
The calculation of OEF element multiplication can be understood as first performing an ordinary polynomial multiplication of the two polynomials and then performing the reduction modulo $f(z)$. Because $f(z)$ has the special form $f(z) = z^m - \omega$, the reduction can replace every $z^m$ with $\omega$. This avoids polynomial division and allows the degree to be reduced term by term.
For example, in the optimal extension field with parameters $f(z) = z^5 - 2$ and $p = 31$, compute the product of two elements $a$ and $b$, where:
$a = z^4 + 5z^2 + 3z + 7$
$b = 9z^2 + z$
The ordinary polynomial product (coefficients modulo $p$) is $c = 9z^6 + z^5 + 14z^4 + z^3 + 4z^2 + 7z$. Term-by-term degree reduction replaces every $z^5$ with 2, that is, $c = 9z \cdot z^5 + z^5 + 14z^4 + z^3 + 4z^2 + 7z = 18z + 2 + 14z^4 + z^3 + 4z^2 + 7z = 14z^4 + z^3 + 4z^2 + 25z + 2$.
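The two phases of this example (ordinary product, then substitution of $z^m$ by $\omega$) can be reproduced with a short serial Java sketch. It is an illustration under stated assumptions, not the patent's implementation: the class name `OefExample` and its methods are hypothetical, index $i$ of each array holds the coefficient of $z^i$, and $p$ and $\omega$ are assumed small enough (below $2^{31}$) that plain `long` products cannot overflow.

```java
// Hypothetical serial illustration of the worked example; not the patent's parallel method.
final class OefExample {
    // Schoolbook product with coefficients reduced mod p.
    // Index i holds the coefficient of z^i. Assumes p < 2^31 so a[i] * b[j] fits in a long.
    static long[] multiply(long[] a, long[] b, long p) {
        long[] c = new long[a.length + b.length - 1];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b.length; j++) {
                c[i + j] = (c[i + j] + a[i] * b[j]) % p;
            }
        }
        return c;
    }

    // Reduction modulo f(z) = z^m - omega: a term of degree k >= m is folded into
    // degree k - m after multiplication by omega. Assumes deg(c) <= 2m - 2.
    static long[] reduce(long[] c, int m, long omega, long p) {
        long[] r = new long[m];
        for (int k = 0; k < c.length; k++) {
            if (k < m) {
                r[k] = (r[k] + c[k]) % p;
            } else {
                r[k - m] = (r[k - m] + omega * c[k]) % p;
            }
        }
        return r;
    }
}
```

With `a = {7, 3, 5, 0, 1}`, `b = {0, 1, 9}` and `p = 31`, `multiply` returns `{0, 7, 4, 1, 14, 1, 9}`; the $z^2$ coefficient is $3 \cdot 1 + 7 \cdot 9 = 66 \equiv 4 \pmod{31}$. Calling `reduce` on that array with `m = 5` and `omega = 2` returns `{2, 25, 4, 1, 14}`, that is, $14z^4 + z^3 + 4z^2 + 25z + 2$.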
On an elliptic curve constructed over an optimal extension field, a single point doubling or point addition requires up to several dozen OEF element operations. Improving the execution speed of OEF operations, and of multiplication in particular, is therefore of great significance for the efficiency of elliptic curve cryptosystems, and a fast and effective OEF element calculation method is very much needed.
Current OEF calculation methods are mostly based on serial algorithms. When these methods are used on the Android platform, they use only the computing resources of the CPU. The CUDA parallel computing framework used on the PC platform cannot be used on the Android system.
RenderScript is a programming framework on the Android platform for 3D rendering and compute-intensive tasks, aimed mainly at computing tasks with data-parallel characteristics. Its operating mechanism parallelizes a computing task and distributes it across all available processor units on the mobile device, such as multi-core CPUs, GPUs, or DSPs. Developers using the RenderScript programming framework can ignore architectural differences between target devices, because RenderScript code uses runtime compilation and caching and automatically discovers and uses the processor resources available on the target device. If the target device has no GPU or DSP, the RenderScript engine hands the computing task entirely to the CPU. The RenderScript programming framework is therefore highly device-independent and portable.
RenderScript can significantly speed up image processing, computer vision, and high-performance computing applications, and extends the execution and computing capability of native Android languages. RenderScript follows the C99 standard and is a programming framework with C-like syntax. Since Android 4.3, RenderScript has been the only parallel computing programming framework in the Android system.
The usual way of using the RenderScript programming framework on the Android platform can be summarized in the following steps:
First, create a compute kernel file with the suffix rs in the Android project and store it under the project's src directory. This file contains the pragma declarations, the corresponding Java class declaration, and the definition of the kernel function. The kernel function is the key means of parallelizing a computing task: in an Android application, it is invoked by an upper-layer Java class object as multiple concurrent instances. Each concurrent function instance performs its computation independently and usually accesses memory units that are isolated from one another.
Second, in the same directory, create the upper-layer Java class that invokes the kernel function, so that it is linked with the rs file from the first step.
Finally, create a RenderScript class object in the Android application, use it to create the upper-layer Java class object described in the second step, allocate resources to it, and initialize it. Data is exchanged and copied between the Java program's memory space and the RenderScript engine's memory space by creating and using Allocation class objects. Both the RenderScript class and the Allocation class are built into the Android development platform and can be used once the relevant packages are imported with the import statement in the Java program.
By improving on the existing serial OEF algorithms, the present invention designs a new storage structure and calculation method, and uses the RenderScript programming interface and parallel processing mechanism on the Android platform to implement fast parallel polynomial modular multiplication.
Summary of the invention
The technical problem to be solved by the present invention is to improve on the existing serial OEF algorithms by designing a new storage structure and calculation method that use the RenderScript programming interface and parallel processing mechanism on the Android platform to implement fast parallel polynomial modular multiplication.
To solve this technical problem, the present invention provides a fast parallel calculation method for optimal extension field element multiplication, comprising:
A first step of designing a dedicated OEF element operator Java class for the multiplication of OEF elements;
A second step of designing, in RenderScript, a multiplication kernel function and a degree-reduction kernel function. Both functions are compute kernels that the Java class object invokes concurrently through the RenderScript execution engine, and each function is defined to operate on a single memory unit or on the first address of a batch of homogeneous memory units.
The first step further comprises:
Step a: design a dedicated Java class for OEF element operations and define three one-dimensional array member variables, two of which store the two polynomials taking part in the multiplication while the third stores the result;
Step b: define a constructor for the OEF element operator class and initialize each member variable in the constructor;
Step c: define a polynomial multiplication method for the OEF element operator class. Using the memory management interface bind provided by the RenderScript programming framework, pass the Allocation class objects corresponding to the two multiplicands to the RenderScript execution engine, where they are stored in two array-type variables in the engine's memory space.
The second step further comprises:
Step a′: design the multiplication kernel function in RenderScript with two parameters. One is an element of the array stored in the Allocation object. The other is the offset of the parallel invocation, assigned automatically by the RenderScript execution engine. Each invoked kernel instance receives a different offset value, and the kernel uses it to determine the position of its unit in the result array and hence how to compute that unit's value;
Step b′: design the degree-reduction kernel function in RenderScript with two parameters. One is an element of the array stored in the Allocation object. The other is the offset of the parallel invocation, assigned automatically by the RenderScript execution engine;
Step c′: design, for the multiplication kernel function, a modulo-$p$ multiplication mechanism that avoids overflow.
Step a further comprises defining three Allocation class objects as class member variables, used to transfer data to the RenderScript engine, and defining three long-integer class member variables that store the OEF parameters $m$, $\omega$ and $p$.
Step b further comprises creating, according to the value of the parameter $m$, two Allocation class objects of equal size, and creating a third Allocation class object to store the calculation result.
Step c further comprises invoking the multiplication kernel function in RenderScript for concurrent computation, and then invoking the degree-reduction kernel function in RenderScript to complete the reduction modulo the irreducible polynomial. Finally, the memory management interface copyTo provided by the RenderScript programming framework copies the result stored in the third Allocation class object back into the member variable of the Java class.
In step a′, the multiplication kernel function uses the offset to determine which term coefficients to read from the two multiplicand polynomials, multiplies them modulo $p$, and adds the products to obtain the value of its unit.
In step b′, the degree-reduction kernel function uses the offset value to determine the position of its element in the result array and hence whether the degree of the corresponding term is below $m$. For each term whose degree is below $m$, the function finds the array element whose term has the same degree after reduction, adds its contribution (scaled by $\omega$ modulo $p$, as the substitution $z^m = \omega$ requires) to the current element, and stores the sum in the current unit.
In step c′, to multiply two numbers $s$ and $r$ modulo $p$, the multiplier $s$ is expressed in binary form and an accumulator variable $t$ is initialized to 0. The bits of $s$ are traversed from right to left. For each bit, if the bit is 1, $r$ is added to $t$ and $t$ is reduced modulo $p$; then $r$ is set to $(r + r) \bmod p$. When the traversal is complete, $t$ holds the product of $s$ and $r$ modulo $p$. If the machine word length is $w$, then as long as $p$ is not greater than $2^{w-1}$ this process cannot overflow, that is, no intermediate result exceeds the range representable in one word.
Beneficial effects of the present invention:
The parallel OEF element multiplication method based on the RenderScript programming framework provided by the present invention (1) implements OEF element multiplication quickly, and (2) can be implemented on any Android device.
Brief description of the drawings
Fig. 1 is a schematic diagram of the data units in the parallel multiplication operation;
Fig. 2 is a schematic diagram of the data units processed by the degree-reduction kernel function.
Detailed description of the embodiments
The present invention provides a fast parallel calculation method for optimal extension field element multiplication, comprising:
A first step of designing a dedicated OEF element operator Java class for the multiplication of OEF elements;
A second step of designing, in RenderScript, a multiplication kernel function and a degree-reduction kernel function. Both functions are compute kernels that the Java class object invokes concurrently through the RenderScript execution engine, and each function is defined to operate on a single memory unit or on the first address of a batch of homogeneous memory units.
The first step further comprises:
Step a: design a dedicated Java class for OEF element operations and define three one-dimensional array member variables, two of which store the two polynomials taking part in the multiplication while the third stores the result;
Step b: define a constructor for the OEF element operator class and initialize each member variable in the constructor;
Step c: define a polynomial multiplication method for the OEF element operator class. Using the memory management interface bind provided by the RenderScript programming framework, pass the Allocation class objects corresponding to the two multiplicands to the RenderScript execution engine, where they are stored in two array-type variables in the engine's memory space;
Step a′: design the multiplication kernel function in RenderScript with two parameters. One is an element of the array stored in the Allocation object. The other is the offset of the parallel invocation, assigned automatically by the RenderScript execution engine. Each invoked kernel instance receives a different offset value, and the kernel uses it to determine the position of its unit in the result array and hence how to compute that unit's value;
Step b′: design the degree-reduction kernel function in RenderScript with two parameters. One is an element of the array stored in the Allocation object. The other is the offset of the parallel invocation, assigned automatically by the RenderScript execution engine;
Step c′: design, for the multiplication kernel function, a modulo-$p$ multiplication mechanism that avoids overflow.
Step a further comprises defining three Allocation class objects as class member variables, used to transfer data to the RenderScript engine, and defining three long-integer class member variables that store the OEF parameters $m$, $\omega$ and $p$.
Step b further comprises creating, according to the value of the parameter $m$, two Allocation class objects of equal size, and creating a third Allocation class object to store the calculation result.
Step c further comprises invoking the multiplication kernel function in RenderScript for concurrent computation, and then invoking the degree-reduction kernel function in RenderScript to complete the reduction modulo the irreducible polynomial. Finally, the memory management interface copyTo provided by the RenderScript programming framework copies the result stored in the third Allocation class object back into the member variable of the Java class.
In step a′, the multiplication kernel function uses the offset to determine which term coefficients to read from the two multiplicand polynomials, multiplies them modulo $p$, and adds the products to obtain the value of its unit.
In step b′, the degree-reduction kernel function uses the offset value to determine the position of its element in the result array and hence whether the degree of the corresponding term is below $m$. For each term whose degree is below $m$, the function finds the array element whose term has the same degree after reduction, adds its contribution (scaled by $\omega$ modulo $p$, as the substitution $z^m = \omega$ requires) to the current element, and stores the sum in the current unit.
For example, the product polynomial of the earlier worked example, $c = 9z^6 + z^5 + 14z^4 + z^3 + 4z^2 + 7z$ (the $z^2$ coefficient is $3 + 63 = 66 \equiv 4 \pmod{31}$), corresponds to the coefficient array {0, 7, 4, 1, 14, 1, 9} stored in the Allocation object, ordered from the lowest degree to the highest. Each element is handled by its own degree-reduction kernel instance. Because the OEF parameters are $f(z) = z^5 - 2$ and $p = 31$, only the elements 0, 7, 4, 1, 14 (degrees 0 to 4) are actually processed by the function. If a degree-reduction kernel instance is invoked with offset 1, it processes the term $7z$. Following the rule above, it looks up the coefficient 9 of the term $9z^6$ in the coefficient array stored in the Allocation object, multiplies it by $\omega = 2$, adds the product to its own coefficient modulo $p$, and writes the result to the current element: $(7 + 2 \cdot 9) \bmod 31 = 25$, the coefficient of $25z$.
In step c′, the multiplication kernel function frequently needs to multiply polynomial coefficients modulo $p$. Although $p$ can be chosen to fit within one machine word (for example 64 bits), the product of two elements of $\mathbb{F}_p$ may well exceed the range of one word, so a plain multiplication cannot be used for the coefficient products. The present invention solves this problem as follows: to multiply two numbers $s$ and $r$ modulo $p$, the multiplier $s$ is expressed in binary form and an accumulator variable $t$ is initialized to 0. The bits of $s$ are traversed from right to left. For each bit, if the bit is 1, $r$ is added to $t$ and $t$ is reduced modulo $p$; then $r$ is set to $(r + r) \bmod p$. When the traversal is complete, $t$ holds the product of $s$ and $r$ modulo $p$. If the machine word length is $w$, then as long as $p$ is not greater than $2^{w-1}$ this process cannot overflow, that is, no intermediate result exceeds the range representable in one word.
The present invention provides a new method for OEF element multiplication on the Android platform, and its parallel approach significantly improves computing performance.
Carry and overflow of intermediate results are avoided when multiplying coefficients modulo $p$, which simplifies the calculation and saves computing time.
Compared in tests with OEF multiplication implemented serially on the Android platform, the parallel method used in the present invention shows a clear advantage.
The present invention is described in detail below with reference to the drawings and preferred embodiments, so that the way in which the invention applies technical means to solve the technical problem and achieve the technical effect can be fully understood and implemented.
In the fast parallel calculation method for OEF element multiplication provided by the present invention, step one is to design a dedicated OEF element operator Java class for the multiplication of OEF elements. The class definition file uses import statements to import android.renderscript.Allocation, android.renderscript.Element and android.renderscript.RenderScript, so that the built-in RenderScript objects can be used. The operator Java class is defined as follows:
1.1) Define member array variables a and b in the OEF element operator Java class to store the two OEF elements taking part in the multiplication, and a member array variable c to store the product. Every element of the three arrays is a 64-bit integer. Also define three integer member variables that record how much of each array is in use, that is, the largest index used. Define three long-integer class member variables that store the OEF parameters $m$, $\omega$ and $p$.
For an OEF element, the polynomial coefficients are stored in the array in order from the lowest degree to the highest, so that index $i$ holds the coefficient of degree $i$. For example, the polynomial $9x^6 + 5x^4 + 6x + 3$ is stored as a[] = {3, 6, 0, 0, 5, 0, 9}.
Define three member variables of the Allocation object type in the OEF element operator Java class, used respectively to transfer the multiplicands and the result data to and from the RenderScript compute engine. When an operator class object is created, the OEF parameters $m$, $\omega$ and $p$ must be specified.
1.2) Define the OEF element multiplication method. Using the memory management interface bind provided by the RenderScript programming framework, assign the contents of the Allocation class objects corresponding to the two multiplicands to two array variables in the memory space of the RenderScript kernel functions.
Invoke the multiplication kernel function concurrently through the forEach_functionName interface of the ScriptC_mono class object, where functionName is the name of the multiplication kernel function. The third Allocation class object is passed to this call as the parameter, so that each of its elements becomes the independent computing unit of one concurrent function instance. When the operation completes, the third Allocation class object holds the result of the ordinary polynomial multiplication.
Invoke the degree-reduction kernel function concurrently through the forEach_functionName interface of the ScriptC_mono class object, where functionName is the name of the degree-reduction kernel function. The third Allocation class object is again passed to the call as the parameter, so that each of its elements becomes the independent computing unit of one concurrent function instance. When the operation completes, the first $m$ elements of the third Allocation class object's array hold the result of the multiplication modulo $f(z)$. The result is then stored in the member array variable c and returned.
Invoking the kernel functions through forEach_functionName allows multiple kernel instances to be called concurrently, which improves the execution efficiency of the algorithm.
Step two: design the multiplication kernel function and the degree-reduction kernel function in RenderScript. Both functions are compute kernels that the Java class object invokes concurrently through the RenderScript execution engine, and each function is defined to operate on a single memory unit or on the first address of a batch of homogeneous memory units.
2.1) Multiplication kernel function
The multiplication kernel function in RenderScript has two parameters. One is an element of the array held by the Allocation object. The other is the offset of the parallel invocation, that is, the offset within the whole array of the element indicated by the first parameter. Because the kernel function is invoked concurrently by the RenderScript execution engine, the first parameter is obtained from the memory passed in through the Allocation object, and the second parameter is assigned automatically by the RenderScript execution engine at invocation time.
The multiplication kernel function executes as follows: for offset $x$, find in the two multiplicand arrays all element pairs $\langle A[i], B[j] \rangle$ whose indices sum to $x$, multiply $A[i]$ and $B[j]$ of each pair modulo $p$, add the products modulo $p$, and store the result in the current element, $C[x]$. Fig. 1 is a schematic diagram of the data units in the parallel multiplication operation of the present invention.
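The per-offset computation can be imitated in plain Java, with a parallel stream standing in for the RenderScript execution engine. This is a sketch under stated assumptions rather than the patent's kernel: the class `MulKernelSketch` and its method names are hypothetical, no RenderScript API is used, and for brevity it assumes $p < 2^{31}$ so that a plain `long` product is safe; the overflow-free modulo-$p$ multiplication described in the text would replace the plain product for larger $p$.

```java
import java.util.stream.IntStream;

// Hypothetical plain-Java stand-in for the multiplication kernel; not RenderScript code.
final class MulKernelSketch {
    // One kernel instance: computes C[x] from the read-only arrays A and B.
    // Index i holds the coefficient of z^i. Assumes p < 2^31 (see lead-in).
    static long mulKernel(long[] a, long[] b, int x, long p) {
        long acc = 0;
        int lo = Math.max(0, x - (b.length - 1)); // keep j = x - i inside B
        int hi = Math.min(x, a.length - 1);       // keep i inside A
        for (int i = lo; i <= hi; i++) {
            acc = (acc + a[i] * b[x - i]) % p;
        }
        return acc;
    }

    // Launches one instance per offset. Each instance writes only its own C[x],
    // so the instances do not interfere with one another.
    static long[] multiplyParallel(long[] a, long[] b, long p) {
        long[] c = new long[a.length + b.length - 1];
        IntStream.range(0, c.length).parallel().forEach(x -> c[x] = mulKernel(a, b, x, p));
        return c;
    }
}
```

For the multiplicands of the earlier example stored with $m = 5$ coefficients each, `multiplyParallel(new long[]{7, 3, 5, 0, 1}, new long[]{0, 1, 9, 0, 0}, 31)` yields `{0, 7, 4, 1, 14, 1, 9, 0, 0}`.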
The modulo-$p$ multiplication is carried out as follows:
To compute $A[i] \cdot B[j] \bmod p$, express $B[j]$ in binary form $(b_{w-1}, b_{w-2}, \ldots, b_2, b_1, b_0)$, where $w$ is the word length. To prevent overflow during the multiplication, $p$ is chosen to be no greater than $2^{w-1}$. Initialize an accumulator variable $t$ to 0. If the lowest bit of $B[j]$ is 1, add $A[i]$ to $t$ modulo $p$ and store the result in $t$. Then add $A[i]$ to itself modulo $p$ and store the result in $A[i]$, and shift $B[j]$ right by one bit. Repeat this process until $B[j]$ equals zero.
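This double-and-add procedure can be written compactly in Java. The sketch below is an illustration, not the patent's RenderScript code; the class `ModMul` and the method `mulMod` are hypothetical names. One assumption differs from the text: Java's `long` is a signed 64-bit type, so the sketch requires $p \le 2^{62}$ to guarantee that $t + r$ and $r + r$ stay below $2^{63}$. It also assumes $0 \le s, r < p$.

```java
// Hypothetical illustration of overflow-free modular multiplication by double-and-add.
final class ModMul {
    // Returns (s * r) mod p. Assumes 0 <= s, r < p and p <= 2^62, so that the
    // sums t + r and r + r never exceed the range of a signed 64-bit long.
    static long mulMod(long s, long r, long p) {
        long t = 0;                  // accumulator
        while (s != 0) {
            if ((s & 1L) == 1L) {    // lowest bit of the multiplier is set
                t += r;
                if (t >= p) t -= p;  // t mod p with a single subtraction
            }
            r += r;                  // r = (r + r) mod p
            if (r >= p) r -= p;
            s >>>= 1;                // move to the next bit
        }
        return t;
    }
}
```

For instance, `ModMul.mulMod(7, 9, 31)` returns 1, since $63 \bmod 31 = 1$. Because both operands are below $p$, each reduction needs only one conditional subtraction rather than a division.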
2.2) Degree-reduction kernel function
The degree-reduction kernel function in RenderScript has two parameters. One is an element of the array C[] that the multiplication kernel function produced and that is stored in the Allocation object. The other is the offset of the parallel invocation, assigned automatically by the RenderScript execution engine. If the offset value $x$ is less than the OEF parameter $m$, the value of element $C[x+m]$ is multiplied by the OEF parameter $\omega$ modulo $p$, the product is added modulo $p$ to the value of the current element $C[x]$, and the result is stored in $C[x]$.
Fig. 2 is a schematic diagram of the data units on which the degree-reduction kernel function operates.
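The degree-reduction step can likewise be imitated in plain Java. As before, this is a sketch and not the patent's kernel: `ReduceKernelSketch` and its methods are hypothetical, a parallel stream stands in for the RenderScript engine, and $p$ and $\omega$ are assumed to be below $2^{31}$ so that the plain product $\omega \cdot C[x+m]$ cannot overflow. The bounds check on $x + m$ is an added safeguard for arrays shorter than $2m$.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical plain-Java stand-in for the degree-reduction kernel; not RenderScript code.
final class ReduceKernelSketch {
    // One kernel instance for offset x: C[x] = (C[x] + omega * C[x + m]) mod p when x < m.
    // Instances with x >= m do nothing. Assumes p, omega < 2^31 (see lead-in).
    static void reduceKernel(long[] c, int x, int m, long omega, long p) {
        if (x < m && x + m < c.length) {
            c[x] = (c[x] + omega * c[x + m]) % p;
        }
    }

    // Launches one instance per element. Writes touch only indices below m and
    // reads of other elements touch only indices at or above m, so no instance
    // reads a value that another instance writes.
    static long[] reduceParallel(long[] c, int m, long omega, long p) {
        IntStream.range(0, c.length).parallel().forEach(x -> reduceKernel(c, x, m, omega, p));
        return Arrays.copyOf(c, m); // the first m elements hold the result mod f(z)
    }
}
```

With the product array `{0, 7, 4, 1, 14, 1, 9}` of the earlier example, `m = 5`, `omega = 2` and `p = 31`, `reduceParallel` returns `{2, 25, 4, 1, 14}`. Note that the sketch modifies the input array in place, as the kernel described in the text does.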
Both the degree-reduction kernel function and the multiplication kernel function operate on a single data unit, and the computation for one data unit is entirely independent of that for any other data unit, so multiple function instances are scheduled in parallel. The parallel function instances access the shared data region in read-only mode, so no conflicts or data inconsistencies arise.
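The independence of the data units can be illustrated with a short Java sketch in which a parallel IntStream stands in for the RenderScript execution engine. This substitution is an assumption made for illustration and is not the patent's implementation. The class name ParallelOefMultiply and the toy parameters are hypothetical, and p is assumed to be below 2^31.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class ParallelOefMultiply {
    // Multiplies two field elements with m coefficients each, reducing modulo x^m - omega.
    static long[] multiply(long[] a, long[] b, long omega, long p) {
        int m = a.length;
        long[] c = new long[2 * m - 1];
        // Stage 1: one independent task per offset x; a and b are only read,
        // and each task writes exactly one slot of c.
        IntStream.range(0, c.length).parallel().forEach(x -> {
            long sum = 0;
            for (int i = Math.max(0, x - (m - 1)); i <= Math.min(x, m - 1); i++) {
                sum = (sum + (a[i] * b[x - i]) % p) % p;
            }
            c[x] = sum;
        });
        // Stage 2: degree reduction; task x reads c[x + m] (index >= m)
        // and writes c[x] (index < m), so tasks never touch each other's data.
        IntStream.range(0, m - 1).parallel().forEach(x -> {
            c[x] = (c[x] + (omega * c[x + m]) % p) % p;
        });
        return Arrays.copyOf(c, m);
    }

    public static void main(String[] args) {
        long[] a = {1, 2, 3};
        long[] b = {4, 5, 6};
        System.out.println(Arrays.toString(multiply(a, b, 2, 7))); // prints [2, 0, 0]
    }
}
```

The two stages run one after the other because the second reads values the first writes; only the tasks within a stage run concurrently.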
The present invention was compared with an optimal extension field multiplication implemented serially on the Android platform, and the tests show that the parallel method used in the present invention has a clear advantage. Measured data for handsets of several brands are shown in Table 2 and Table 3:
Table 2: Test data for different mobile phones
Table 3: Mobile phone configurations
The foregoing describes only the primary implementation of this intellectual property and does not limit the implementation of this new product and/or new method in other forms. Those skilled in the art may use this information to modify the above content and achieve similar implementations. However, all such modifications or transformations based on the new product of the present invention belong to the reserved rights.
The above is only a preferred embodiment of the present invention and does not limit the invention in other forms. Any person skilled in the art may use the technical content disclosed above to make changes or modifications that yield equivalent embodiments. However, any simple modification, equivalent change or adaptation made to the above embodiments according to the technical essence of the present invention, without departing from its technical solution, still falls within the protection scope of the technical solution of the present invention.
Claims (7)
1. A fast parallel calculation method for optimal extension field element multiplication, characterized by comprising:
First step: designing a dedicated optimal extension field element arithmetic unit Java class for the multiplication of optimal extension field elements;
Second step: designing a multiplication kernel function and a degree-reduction kernel function in RenderScript, the two functions being the computation kernels invoked in parallel by the RenderScript execution engine through a Java class object, each function being defined to operate on a single memory unit or on the first address of a batch of homogeneous memory units;
The first step further comprises:
Step a: designing a dedicated Java class for optimal extension field element operations and defining three one-dimensional array class member variables, two of which store the two polynomials taking part in the multiplication and the third of which stores the multiplication result;
Step b: defining a constructor for the optimal extension field element arithmetic unit class and initializing each member variable in the constructor;
Step c: defining a polynomial multiplication method for the optimal extension field element arithmetic unit class, and using the memory management interface bind provided by the RenderScript programming framework to pass the Allocation class objects corresponding to the two multipliers to the RenderScript execution engine, where they are stored in two array-type variables in the memory space of the RenderScript execution engine;
The second step further comprises:
Step a': designing the multiplication kernel function in RenderScript and defining two parameters: one is an element of the array stored in an Allocation object; the other is the offset of the parallel invocation, assigned automatically by the RenderScript execution engine; the offset value differs for each invoked kernel function instance, and the kernel function uses the offset value to determine the position of the unit in the result array and hence the calculation method for that unit's value;
Step b': designing the degree-reduction kernel function in RenderScript and defining two parameters: one is an element of the array stored in an Allocation object; the other is the offset of the parallel invocation, assigned automatically by the RenderScript execution engine;
Step c': designing, in the multiplication kernel function, a modulo-p multiplication mechanism that avoids overflow handling.
2. The fast parallel calculation method for optimal extension field element multiplication according to claim 1, characterized in that step a further comprises: defining three Allocation class objects as class member variables for transferring data to and from the RenderScript engine, and defining three long-integer class member variables to store the optimal extension field parameters m, ω and p.
3. The fast parallel calculation method for optimal extension field element multiplication according to claim 2, characterized in that step b further comprises: creating, according to the value of parameter m, two Allocation class objects of equal size, and creating a third Allocation class object for storing the calculation result.
4. The fast parallel calculation method for optimal extension field element multiplication according to claim 1, characterized in that step c further comprises: invoking the multiplication kernel function in RenderScript for parallel computation, then invoking the degree-reduction kernel function in RenderScript to complete the reduction modulo the irreducible polynomial, and finally using the memory management interface copyTo provided by the RenderScript programming framework to pass the calculation result stored in the third Allocation class object back to the member variable of the Java class.
5. The fast parallel calculation method for optimal extension field element multiplication according to claim 1, characterized in that: in step a', the multiplication kernel function determines from the offset which term coefficients to read from the two multiplier polynomials, multiplies them modulo p and adds the products to obtain the unit's value.
6. The fast parallel calculation method for optimal extension field element multiplication according to claim 2, characterized in that: in step b', during execution of the degree-reduction kernel function, the position of the unit in the result array is determined from the offset value, and it is then judged whether the degree of the corresponding term exceeds m; for each term whose degree does not exceed m, the array element corresponding to the term that has the same degree after degree reduction is located, added to it, and the sum is stored in this unit.
7. The fast parallel calculation method for optimal extension field element multiplication according to claim 1, characterized in that: in step c', when computing the modulo-p product of two numbers s and r, the multiplier s is expressed in binary form; an accumulator variable t is used with its initial value set to 0; the binary string of s is traversed bit by bit from right to left, and for each bit visited it is checked whether the bit is 1; if it is 1, r is added to t and t is saved modulo p; r is then set to r + r mod p; after the traversal is complete, t holds the product of s and r modulo p; if the machine word length is w, then as long as the value of p does not exceed 2^{w-1}, the above process cannot overflow, that is, no intermediate result exceeds the range representable in one word.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610305021.3A CN106020949B (en) | 2016-05-09 | 2016-05-09 | A kind of fast parallel calculation method of optimal extension field element multiplication |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106020949A CN106020949A (en) | 2016-10-12 |
CN106020949B true CN106020949B (en) | 2019-08-06 |
Family
ID=57098983
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610305021.3A Expired - Fee Related CN106020949B (en) | 2016-05-09 | 2016-05-09 | A kind of fast parallel calculation method of optimal extension field element multiplication |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106020949B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101271570A (en) * | 2008-05-07 | 2008-09-24 | 威盛电子股份有限公司 | Apparatus and method for large integer multiplication operation |
CN103731254A (en) * | 2012-10-14 | 2014-04-16 | 张仁平 | Correcting and applying system of fast algorithm library for number theory (NTL) |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9244683B2 (en) * | 2013-02-26 | 2016-01-26 | Nvidia Corporation | System, method, and computer program product for implementing large integer operations on a graphics processing unit |
Non-Patent Citations (2)
Title |
---|
A Study on Android-Based Real Number Field Elliptic Curve Key Table Generation;Eun-hee Goo, 等;《Computer Application for Security, Control and System Engineering》;20121202;第176-181页 * |
Fast Arithmetic on ATmega128 for Elliptic Curve Cryptography;Anton Kargl,等;《Cryptology ePrint Archive: Report 2008/442》;20081021;第1-15页 * |
Also Published As
Publication number | Publication date |
---|---|
CN106020949A (en) | 2016-10-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | | Granted publication date: 20190806 Termination date: 20210509 |