CN106020949A - Fast parallel calculation method of optimal extension field element multiplication - Google Patents

Publication number
CN106020949A
Authority
CN
China
Prior art keywords
multiplication
renderscript
main body
field element
extension field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610305021.3A
Other languages
Chinese (zh)
Other versions
CN106020949B (en
Inventor
咸鹤群
程相国
张曙光
张曼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao University
Original Assignee
Qingdao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao University filed Critical Qingdao University
Priority to CN201610305021.3A priority Critical patent/CN106020949B/en
Publication of CN106020949A publication Critical patent/CN106020949A/en
Application granted granted Critical
Publication of CN106020949B publication Critical patent/CN106020949B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/542 Event management; Broadcasting; Multicasting; Notifications

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention provides a fast parallel method for multiplying elements of an optimal extension field. The method comprises two steps: first, designing a dedicated Java arithmetic-unit class for the multiplication of optimal extension field elements; second, designing a multiplication kernel function and a degree-reduction kernel function in RenderScript. The two functions are the compute kernels that a Java class object invokes concurrently through the RenderScript execution engine, and each is defined to operate on the starting address of a single memory unit or of a batch of memory units with homogeneous features. By using the RenderScript programming interface and the parallel processing mechanism of the Android platform, the method achieves fast parallel polynomial modular multiplication.

Description

A fast parallel computation method for optimal extension field element multiplication
Technical field
The present invention relates to the field of computer technology, and in particular to an optimal extension field multiplication method based on the RenderScript programming framework.
Background art
A finite field $F_{p^m}$ is called an optimal extension field (OEF) if $p = 2^n - c$ (where $p$ is a prime within $2^{64}$, and the integers $n$ and $c$ satisfy $\log_2 |c| \le n/2$) and there exists an irreducible polynomial $f(z) = z^m - \omega$. The elements of this field are polynomials of degree at most $m-1$ whose coefficients are elements of $F_p$. The basic operations on OEF elements (addition, subtraction, multiplication, squaring, inversion) are polynomial operations modulo $f(z)$, and the coefficient arithmetic is defined over $F_p$, that is, arithmetic modulo $p$.
When $c$ is 1 or -1, the optimal extension field is said to be of Type I; if $\omega = 2$, it is said to be of Type II. Table 1 lists some commonly used OEF parameters.
Table 1. Example OEF parameters.
p             f             Parameters                     Type
$2^7+3$       $z^{13}-5$    n=7, c=-3, m=13, ω=5           -
$2^{13}-1$    $z^{13}-2$    n=13, c=1, m=13, ω=2           I, II
$2^{57}-13$   $z^3-2$       n=57, c=13, m=3, ω=2           II
$2^{31}-19$   $z^6-2$       n=31, c=19, m=6, ω=2           II
$2^{61}-1$    $z^3-37$      n=61, c=1, m=3, ω=37           I
$2^{31}-1$    $z^6-7$       n=31, c=1, m=6, ω=7            I
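The parameter conditions above can be checked mechanically. The following Python sketch is illustrative only: the function names `is_prime` and `check_oef_params` are hypothetical, and the sketch does not test whether $z^m - \omega$ is irreducible, which a full OEF check would also require. It computes $p = 2^n - c$, verifies primality and the bound $\log_2 |c| \le n/2$, and reports the Type I / Type II labels.

```python
import math

def is_prime(n):
    """Deterministic Miller-Rabin test, valid for the word-sized moduli used here."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for q in bases:
        if n % q == 0:
            return n == q
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True

def check_oef_params(n, c, omega):
    """Return (p, types) if p = 2^n - c meets the stated conditions, else None.

    Assumption: irreducibility of z^m - omega is NOT checked here.
    """
    if c == 0:
        return None
    p = 2 ** n - c
    if not is_prime(p) or math.log2(abs(c)) > n / 2:
        return None
    types = []
    if c in (1, -1):
        types.append("I")
    if omega == 2:
        types.append("II")
    return p, types
```

For instance, `check_oef_params(13, 1, 2)` evaluates the second row of Table 1.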
Optimal extension fields can be used to construct elliptic curves. Their advantage is that the prime p can be chosen to fit within a single machine word, while the security of the cryptosystem is provided by m and p together. As a result, problems such as big-integer carry propagation can be simplified or avoided when computing polynomial coefficients, which improves the efficiency of software implementations.
Among the basic operations on optimal extension field elements, polynomial multiplication has comparatively high complexity, and multiplication is also the basis of operations such as inversion.
Darrel Hankerson et al. describe the principle of optimal extension field element multiplication in the book "Guide to Elliptic Curve Cryptography". The product of two optimal extension field elements a and b can be computed as an ordinary polynomial multiplication with coefficients modulo p, followed by reduction modulo the irreducible polynomial f(z). The formula is:
$$c(z) = a(z)\,b(z) = \left(\sum_{i=0}^{m-1} a_i z^i\right)\left(\sum_{j=0}^{m-1} b_j z^j\right) \equiv \sum_{k=0}^{2m-2} c_k z^k \equiv c_{m-1} z^{m-1} + \sum_{k=0}^{m-2} \left(c_k + \omega c_{k+m}\right) z^k \pmod{f(z)}$$
where
$$c_k = \sum_{i+j=k} a_i b_j \bmod p.$$
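As a rough illustration of this formula, the Python sketch below first forms the coefficients $c_k$ of the ordinary product and then folds the upper half into the lower half with the factor $\omega$. The function name `oef_multiply` and the convention that list index equals degree are assumptions made for the sketch; Python integers do not overflow, so the word-size issues discussed later do not arise here.

```python
def oef_multiply(a, b, m, omega, p):
    """Multiply two OEF elements given as coefficient lists of length m.

    Index k of each list holds the coefficient of z^k (an assumed layout).
    """
    # c_k = sum of a_i * b_j over i + j = k, reduced mod p
    c = [0] * (2 * m - 1)
    for i in range(m):
        for j in range(m):
            c[i + j] = (c[i + j] + a[i] * b[j]) % p
    # Fold: for k = 0..m-2 the result coefficient is c_k + omega * c_{k+m};
    # the coefficient of z^(m-1) is c_{m-1} unchanged.
    result = [(c[k] + omega * c[k + m]) % p for k in range(m - 1)]
    result.append(c[m - 1])
    return result
```

The serial double loop is the baseline that the parallel method described later replaces.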
That description gives only the principle. It does not cover implementation details and does not consider problems such as overflow and carry during coefficient multiplication and modular reduction.
The Karatsuba-Ofman method uses a divide-and-conquer approach to reduce the number of coefficient multiplications. Each of the polynomials a and b is split into a sum of two polynomials, and the distributive law of multiplication is used to reduce the number of multiplications. For example:
$$a(z)\,b(z) = (A_1 z^l + A_0)(B_1 z^l + B_0) = A_1 B_1 z^{2l} + \left[(A_1 + A_0)(B_1 + B_0) - A_1 B_1 - A_0 B_0\right] z^l + A_0 B_0$$
where $A_0$, $A_1$, $B_0$, $B_1$ are all polynomials of degree less than $l$. The process can be applied recursively, that is, $A_0$, $A_1$, $B_0$, $B_1$ can be decomposed again. In the Karatsuba-Ofman method the polynomials are decomposed level by level, and each computation depends on the results of the subsequent decompositions. The data of the individual computation steps are therefore not independent, so the method cannot be parallelized and cannot exploit multi-core parallel computing. In addition, the polynomial splitting is fairly complex to implement, the choice of splitting method directly affects execution efficiency, and for small values of m the overhead is large.
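A single level of this decomposition can be sketched in Python as follows. The helper names are hypothetical, and the split point $l = \lceil n/2 \rceil$ for inputs of length $n$ is an assumption of the sketch rather than something the text specifies. Three half-size products replace the four that a direct expansion would need.

```python
def poly_mul(a, b, p):
    """Schoolbook product of coefficient lists (index = degree), mod p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out

def poly_add(a, b, p, sign=1):
    """Return a + sign*b mod p, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [(x + sign * y) % p for x, y in zip(a, b)]

def karatsuba_once(a, b, p):
    """One Karatsuba-Ofman level for equal-length inputs (length >= 2)."""
    n = len(a)
    l = (n + 1) // 2                      # assumed split point
    A0, A1, B0, B1 = a[:l], a[l:], b[:l], b[l:]
    low = poly_mul(A0, B0, p)             # A0*B0
    high = poly_mul(A1, B1, p)            # A1*B1
    mid = poly_mul(poly_add(A0, A1, p), poly_add(B0, B1, p), p)
    mid = poly_add(poly_add(mid, high, p, -1), low, p, -1)
    # Recombine: low + mid*z^l + high*z^(2l)
    out = [0] * max(2 * n - 1, 3 * l - 1)
    for shift, part in ((0, low), (l, mid), (2 * l, high)):
        for k, v in enumerate(part):
            out[k + shift] = (out[k + shift] + v) % p
    return out[:2 * n - 1]
```

Note that `mid` cannot be formed until `low` and `high` are known, which is the data dependency the paragraph above refers to.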
The computation of an optimal extension field element product can be understood as first performing an ordinary polynomial multiplication of the two polynomials and then reducing the result modulo f(z). Because f(z) has the special form $f(z) = z^m - \omega$, every $z^m$ can be replaced by $\omega$ when reducing modulo f(z). This avoids polynomial division and allows the degree to be reduced term by term.
For example, in the optimal extension field with irreducible polynomial $f(z) = z^5 - 2$ and $p = 31$, compute the product of two elements a and b, where:
$a = z^4 + 5z^2 + 3z + 7$
$b = 9z^2 + z$
The ordinary polynomial product (coefficients modulo p) is $c = 9z^6 + z^5 + 14z^4 + z^3 + 4z^2 + 7z$. Term-by-term degree reduction replaces every $z^5$ with 2, that is, $c = 9z \cdot z^5 + z^5 + 14z^4 + z^3 + 4z^2 + 7z = 18z + 2 + 14z^4 + z^3 + 4z^2 + 7z = 14z^4 + z^3 + 4z^2 + 25z + 2$.
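The substitution of $\omega$ for $z^m$ can be written as a short loop that walks from the highest-degree term downward. The Python sketch below is illustrative: the names `poly_mul_mod_p` and `reduce_mod_f` are hypothetical, and coefficient lists are assumed to be indexed by degree.

```python
def poly_mul_mod_p(a, b, p):
    """Ordinary polynomial product with coefficients reduced mod p."""
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] = (c[i + j] + x * y) % p
    return c

def reduce_mod_f(c, m, omega, p):
    """Reduce c modulo f(z) = z^m - omega by replacing z^m with omega."""
    c = [x % p for x in c] + [0] * max(0, m - len(c))
    # z^k = omega * z^(k-m) for k >= m, so move each high coefficient down.
    for k in range(len(c) - 1, m - 1, -1):
        c[k - m] = (c[k - m] + omega * c[k]) % p
        c[k] = 0
    return c[:m]

# The example from the text: a = z^4 + 5z^2 + 3z + 7, b = 9z^2 + z
a = [7, 3, 5, 0, 1]
b = [0, 1, 9, 0, 0]
product = poly_mul_mod_p(a, b, 31)
reduced = reduce_mod_f(product, 5, 2, 31)
```

No polynomial division is performed; each high-degree term costs one multiplication by $\omega$ and one addition.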
On an elliptic curve constructed over an optimal extension field, a single point doubling or point addition requires up to dozens of optimal extension field element operations. Improving the execution speed of optimal extension field arithmetic, especially multiplication, is therefore very important for the efficiency of elliptic curve cryptosystems, and a fast and effective method for computing with optimal extension field elements is needed.
Most current optimal extension field computation methods are based on serial algorithms. On the Android platform these methods use only the computing resources of the CPU. The CUDA parallel computing architecture used on the PC platform cannot be used on Android.
RenderScript is a programming framework on the Android platform for 3D rendering and compute-intensive tasks, aimed mainly at tasks with data-parallel characteristics. Its operating mechanism parallelizes a computing task and distributes it across all available processing units on the mobile device, such as multi-core CPUs, GPUs, or DSPs. Developers using the RenderScript framework can ignore architectural differences between target devices, because RenderScript code is compiled at run time and cached, and can automatically discover and use the various processor resources on the target device. If the device running the program has no GPU or DSP, the RenderScript engine hands the computing task entirely to the CPU. The RenderScript framework therefore offers high device independence and portability. RenderScript can significantly speed up image processing, computer vision, and high-performance computing applications, and extends the execution and computing capability of the native Android language. RenderScript follows the C99 standard and is a programming framework with C-like syntax. Since Android 4.3, RenderScript has been the only parallel computing programming framework in the Android system.
The usual way to use the RenderScript programming framework on the Android platform can be summarized in the following steps:
First, in the Android project, create a compute kernel file with the suffix .rs and store it under the project's src directory. This file contains the pragma statements, the corresponding Java class declaration, and the definitions of the kernel functions. Kernel functions are the main means of parallelizing a computing task. In an Android application, a kernel function is invoked by an upper-layer Java class object as many concurrent instances. Each concurrent function instance performs its computing task independently and normally accesses memory units that are isolated from those of the other instances.
The second step is to create, in the same directory, the upper-layer Java class that invokes the kernel function, so as to link it with the .rs file from the first step.
Finally, the Android application creates a RenderScript object, uses it to create the upper-layer Java class object described in the second step, allocates resources for it, and initializes it. By creating and using Allocation objects, data are exchanged and copied between the memory space of the Java program and the memory space of the RenderScript engine. RenderScript and Allocation are both classes preset in the Android development platform and can be used once the relevant packages are imported with import statements in the Java program.
The present invention improves on the existing serial algorithms for optimal extension fields by designing a new storage structure and computation method, and uses the RenderScript programming interface and parallel processing mechanism of the Android platform to achieve fast parallel polynomial modular multiplication.
Summary of the invention
The technical problem to be solved by the invention is to improve on the existing serial algorithms for optimal extension fields: to design a new storage structure and computation method and, using the RenderScript programming interface and parallel processing mechanism of the Android platform, to achieve fast parallel polynomial modular multiplication.
To solve the above technical problem, the invention provides a fast parallel computation method for optimal extension field element multiplication, comprising:
Step one: design a dedicated optimal extension field element arithmetic-unit Java class for the multiplication of optimal extension field elements;
Step two: design a multiplication kernel function and a degree-reduction kernel function in RenderScript. These two functions are the compute kernels that the Java class object invokes concurrently through the RenderScript execution engine, and each is defined to operate on the starting address of a single memory unit or of a batch of memory units with homogeneous features.
Step one further comprises:
Step a: design a dedicated Java class for optimal extension field element arithmetic and define three one-dimensional array member variables, two of which store the two polynomials to be multiplied while the third stores the product;
Step b: define a constructor for the arithmetic-unit class and initialize each member variable in the constructor;
Step c: define a polynomial multiplication method for the arithmetic-unit class. Using the memory-management interface bind provided by the RenderScript framework, the method passes the Allocation objects corresponding to the two multiplicands to the RenderScript execution engine, where they are held in two array-type variables in the engine's memory space.
Step two further comprises:
Step a′: design the multiplication kernel function in RenderScript with two parameters. One is an element of the array stored in an Allocation object. The other is the offset of the parallel invocation, assigned automatically by the RenderScript execution engine. Every invoked kernel instance receives a different offset value. The kernel uses the offset to determine the position of its element in the result array and hence how the value of that element is computed;
Step b′: design the degree-reduction kernel function in RenderScript with two parameters. One is an element of the array stored in an Allocation object. The other is the offset of the parallel invocation, assigned automatically by the RenderScript execution engine;
Step c′: design, for use inside the multiplication kernel function, a mod-p multiplication mechanism that avoids overflow.
Step a further comprises defining three Allocation objects as member variables for passing data to the RenderScript engine, and defining three long-integer member variables that store the optimal extension field parameters m, ω, and p.
Step b further comprises creating, according to the value of the parameter m, two Allocation objects of equal size, and creating a third Allocation object for storing the result.
Step c further comprises invoking the multiplication kernel function in RenderScript to perform the parallel computation and then invoking the degree-reduction kernel function in RenderScript to complete the reduction modulo the irreducible polynomial. Finally, the memory-management interface copyTo provided by the RenderScript framework copies the result stored in the third Allocation object back into the member variable of the Java class.
In step a′, the multiplication kernel function uses the offset to determine which term coefficients to read from the two multiplicand polynomials, multiplies them modulo p, and adds the products to obtain the value of the element.
In step b′, while the degree-reduction kernel function executes, it determines the position of its element in the result array from the offset and thereby decides whether the degree of the corresponding term is less than m. For a term of degree less than m, it finds the array element whose term will have the same degree after reduction, multiplies that element by ω modulo p, adds it to its own value modulo p, and stores the sum in its own element.
In step c′, to compute the product of two numbers s and r modulo p, the multiplier s is written in binary. An accumulator variable t is initialized to 0. The binary string of s is traversed bit by bit from right to left. For each bit, if the bit is 1, r is added to t and t is stored modulo p; then r is set to r + r modulo p. When the traversal is complete, t holds the product of s and r modulo p. If the machine word length is w bits, then as long as p is not greater than $2^{w-1}$ this process cannot overflow, because no result of any operation exceeds the range representable in one word.
Beneficial effects of the invention:
The parallel computation method for optimal extension field element multiplication based on the RenderScript programming framework provided by the invention: (1) performs optimal extension field element multiplication quickly; (2) can be implemented on any Android device.
Brief description of the drawings
Fig. 1 is a schematic diagram of the data units in the parallel multiplication;
Fig. 2 is a schematic diagram of the data units of the degree-reduction kernel function.
Detailed description
In step b′, while the degree-reduction kernel function executes, it determines the position of its element in the result array from the offset and thereby decides whether the degree of the corresponding term is less than m. For a term of degree less than m, it finds the array element whose term will have the same degree after reduction, multiplies that element by ω modulo p, adds it to its own value modulo p, and stores the sum in its own element.
For example, for the product polynomial $c = 9z^6 + z^5 + 14z^4 + z^3 + 4z^2 + 7z$ computed earlier, the coefficient array stored in the Allocation object holds, from the constant term up to $z^6$, the values {0, 7, 4, 1, 14, 1, 9}. Each element is handled separately by one instance of the degree-reduction kernel function. Because the parameters of the optimal extension field are $f(z) = z^5 - 2$ and $p = 31$, only the five elements of degree 0 to 4 (0, 7, 4, 1, 14) are updated by the function. If a degree-reduction kernel instance is invoked with offset 1, it handles the term 7z. Following the rule above, the instance finds the coefficient 9 of the term $9z^6$ in the coefficient array, multiplies it by ω = 2, adds the result to its own coefficient modulo p, and writes the sum into the current element. This yields 25, the coefficient of 25z.
In step c′, the multiplication kernel function makes frequent use of mod-p multiplication of polynomial coefficients. Although p can be chosen to fit within one machine word (for example 64 bits), the product of two elements of $F_p$ may still exceed the range of one word, so a plain multiplication cannot be used for the coefficients. The invention solves this problem as follows. To compute the product of two numbers s and r modulo p, the multiplier s is written in binary. An accumulator variable t is initialized to 0. The binary string of s is traversed bit by bit from right to left. For each bit, if the bit is 1, r is added to t and t is stored modulo p; then r is set to r + r modulo p. When the traversal is complete, t holds the product of s and r modulo p. If the machine word length is w bits, then as long as p is not greater than $2^{w-1}$ this process cannot overflow, because no result of any operation exceeds the range representable in one word.
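A minimal Python sketch of this shift-and-add multiplication follows. The function name `mul_mod` and the parameter `w` are illustrative and do not come from the patent. Python integers never overflow, so the sketch instead asserts that every intermediate value stays below $2^w$, which is the property the text relies on when the word is treated as an unsigned w-bit quantity.

```python
def mul_mod(s, r, p, w=64):
    """Compute (s * r) mod p using only additions and reductions mod p.

    Assumption: models an unsigned w-bit word; requires p <= 2^(w-1).
    """
    limit = 1 << w
    assert 0 < p <= 1 << (w - 1), "p must not exceed 2^(w-1)"
    s %= p
    r %= p
    t = 0                      # accumulator, always kept below p
    while s > 0:
        if s & 1:              # current lowest bit of the multiplier is 1
            t = t + r          # t < p and r < p, so t + r < 2p <= 2^w
            assert t < limit
            t %= p
        r = r + r              # doubling: r + r < 2p <= 2^w
        assert r < limit
        r %= p
        s >>= 1                # move to the next bit, right to left
    return t
```

The loop runs once per bit of s, so a w-bit multiplier costs at most w doublings and w conditional additions.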
The invention provides a new computation method for optimal extension field element multiplication on the Android platform and significantly improves computing performance through parallelism.
When handling coefficient multiplication modulo p, it avoids carry and overflow in intermediate results, which simplifies the computation and saves computing time.
The invention was compared with an optimal extension field multiplication implemented serially on the Android platform. The tests show that the parallel method used in the invention has a clear advantage.
Embodiments are used below to describe the implementation of the invention in detail, so that the way the invention applies technical means to solve the technical problem and achieve the technical effect can be fully understood and put into practice.
In the fast parallel computation method for optimal extension field element multiplication provided by the invention, step one is to design a dedicated optimal extension field element arithmetic-unit Java class for the multiplication of optimal extension field elements. The class definition file uses import statements to import the three packages android.renderscript.Allocation, android.renderscript.Element, and android.renderscript.RenderScript so that the built-in RenderScript objects can be used. The arithmetic-unit Java class is defined as follows:
1.1) In the arithmetic-unit Java class, define member arrays a and b to store the two optimal extension field elements to be multiplied, and define member array c to store the product. Every element of the three arrays is a 64-bit integer. Also define three integer member variables that record the space usage of these arrays, that is, the largest index currently in use. Define three long-integer member variables that store the optimal extension field parameters m, ω, and p.
For an optimal extension field element, the polynomial coefficients are stored in the array in order from the lowest degree to the highest. For example:
the polynomial $9x^6 + 5x^4 + 6x + 3$ is stored in the array as a[] = {3, 6, 0, 0, 5, 0, 9}
In the arithmetic-unit Java class, define three member variables of type Allocation, used to pass the multiplicands and the result to and from the RenderScript compute engine. When an arithmetic-unit object is created, the optimal extension field parameters m, ω, and p must be specified.
1.2) Define the optimal extension field element multiplication method: using the memory-management interface bind provided by the RenderScript framework, assign the contents of the Allocation objects corresponding to the two multiplicands to two array variables in the memory space of the RenderScript kernel functions.
The multiplication kernel function is invoked concurrently through the forEach_functionName interface of the ScriptC_mono class object, where functionName is the name of the multiplication kernel function. The third Allocation object is passed to this call as the argument, so that each of its elements independently serves as the computation unit of one concurrent function instance. When the computation is complete, the third Allocation object holds the result of the ordinary polynomial multiplication.
The degree-reduction kernel function is invoked concurrently through the forEach_functionName interface of the ScriptC_mono class object, where functionName is the name of the degree-reduction kernel function. The third Allocation object is again passed to this call as the argument, so that each of its elements independently serves as the computation unit of one concurrent function instance. When the computation is complete, the first m elements of the third Allocation object's array hold the result of the multiplication modulo f(z). The result is then stored in the member array c and returned.
Invoking a kernel function through forEach_functionName allows multiple kernel instances to be called concurrently, which improves the execution efficiency of the algorithm.
Step two is to design the multiplication kernel function and the degree-reduction kernel function in RenderScript. These two functions are the compute kernels that the Java class object invokes concurrently through the RenderScript execution engine, and each is defined to operate on the starting address of a single memory unit or of a batch of memory units with homogeneous features.
2.1) Multiplication kernel function
The multiplication kernel function in RenderScript has two parameters. One is an element of the array held by an Allocation object. The other is the offset of the parallel invocation, that is, the offset within the whole array of the array element indicated by the first parameter. Because this kernel function is invoked concurrently by the RenderScript execution engine, the first parameter is obtained from the memory passed in through the Allocation object, and the second parameter is assigned automatically by the RenderScript execution engine at the time of the concurrent call.
The multiplication kernel function executes as follows: for offset x, find in the two multiplicand arrays all element pairs <A[i], B[j]> whose indices sum to x, multiply A[i] by B[j] modulo p for each pair, add the products modulo p, and store the result in the current element, C[x]. Fig. 1 is a schematic diagram of the data units in the parallel multiplication of the invention.
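The per-offset computation can be sketched in Python as below. This is not RenderScript code: a thread pool stands in for the execution engine's concurrent dispatch, and the names `mul_kernel` and `parallel_product` are hypothetical. Each call to `mul_kernel` reads the shared inputs and produces exactly one output element, mirroring one kernel instance. Python's built-in multiplication is used for the coefficient products, so the overflow-safe mod-p routine is not needed in the sketch.

```python
from concurrent.futures import ThreadPoolExecutor

def mul_kernel(A, B, x, p):
    """Compute C[x]: sum of A[i] * B[j] mod p over all pairs with i + j == x."""
    m = len(A)                                   # A and B both have length m
    total = 0
    # i ranges over the indices for which j = x - i is also within 0..m-1
    for i in range(max(0, x - m + 1), min(x, m - 1) + 1):
        total = (total + (A[i] * B[x - i]) % p) % p
    return total

def parallel_product(A, B, p):
    """Run one kernel instance per offset 0..2m-2 and collect C in order."""
    m = len(A)
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda x: mul_kernel(A, B, x, p), range(2 * m - 1)))
```

Because no instance writes to anything another instance reads, the offsets can be processed in any order or all at once.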
The modulo-p multiplication is carried out as follows:
To compute A[i] * B[j] mod p, write B[j] in binary form (b_(w-1), b_(w-2), ..., b_2, b_1, b_0), where w is the word length. To prevent overflow during the multiplication, p is chosen to be no greater than 2^(w-1). Initialize an accumulator variable t to 0. If the lowest bit of B[j] is 1, add A[i] to t modulo p and save the result in t. Then add A[i] to itself modulo p and store the result in A[i], and shift B[j] right by one bit. Repeat this process until B[j] equals zero.
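The shift-and-add procedure above can be sketched as follows. The names ModMulSketch, addMod and mulMod are hypothetical. Because Java's long is a signed 64-bit type, the sketch assumes non-negative operands and p <= 2^62, so that every intermediate sum stays below 2^63.

```java
final class ModMulSketch {
    // (a + b) mod p for 0 <= a, b < p; the sum is below 2p, so it cannot overflow
    // under the stated bound on p.
    static long addMod(long a, long b, long p) {
        long s = a + b;
        return s >= p ? s - p : s;
    }

    // a * b mod p without ever forming the full product a * b.
    static long mulMod(long a, long b, long p) {
        long t = 0;      // accumulator, initially 0
        a %= p;          // keep the running multiple of a below p
        while (b != 0) {
            if ((b & 1L) == 1L) {
                t = addMod(t, a, p); // lowest bit of b is 1: add a into t
            }
            a = addMod(a, a, p);     // double a modulo p
            b >>>= 1;                // shift b right by one bit
        }
        return t;
    }
}
```

The loop runs once per bit of b, so the cost is at most w modular additions and w modular doublings for a w-bit multiplier.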
2.2) Degree-reduction kernel function
This kernel function in RenderScript takes two parameters. The first is one element of the array C, stored in an Allocation object, that holds the output of the multiplication kernel function; the second is the offset of the parallel invocation, assigned automatically by the RenderScript execution engine. If the offset x is less than the optimal extension field parameter m, the value of element C[x+m] is multiplied by the optimal extension field parameter ω modulo p, the product is added to the value of the current element C[x] modulo p, and the result is stored in C[x].
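The degree-reduction step above can be sketched as follows. The sketch rests on several assumptions: the reduction polynomial has the form f(z) = z^m - ω (so that z^m can be replaced by ω), the product array holds 2m - 1 elements, and p is below 2^31 so that ω * C[x+m] fits in a Java long. The names OefReduceSketch, reduceCell and reduce are hypothetical.

```java
import java.util.Arrays;

final class OefReduceSketch {
    // Folds the high coefficient C[x + m] into C[x] using z^m = omega (mod f(z)).
    // The bounds check covers the assumed array length of 2m - 1, for which
    // offset x = m - 1 has no matching high coefficient.
    static void reduceCell(long[] c, int m, long omega, long p, int x) {
        if (x < m && x + m < c.length) {
            long folded = (omega * c[x + m]) % p; // omega * C[x+m] mod p
            c[x] = (c[x] + folded) % p;           // add to C[x] mod p
        }
    }

    // Serial driver: applies reduceCell to every offset and returns the first m cells.
    static long[] reduce(long[] c, int m, long omega, long p) {
        long[] work = c.clone();
        for (int x = 0; x < work.length; x++) {
            reduceCell(work, m, omega, p, x);
        }
        return Arrays.copyOf(work, m);
    }
}
```

Each instance writes only to an index below m and reads only from an index at or above m, which is why the instances can run in any order.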
Figure 2 is a schematic diagram of the data units on which the degree-reduction kernel function operates.
Both the degree-reduction kernel function and the multiplication kernel function operate on a single data unit, and the computation for that unit is entirely independent of the computation for any other data unit; multiple function instances are therefore scheduled in parallel. The parallel function instances access shared data regions in read-only mode, so no conflicts or data inconsistencies arise.
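The independence of the data units can be illustrated with Java parallel streams standing in for the RenderScript execution engine. This is only an analogy under stated assumptions, not the Android API used by the invention: the class name ParallelOefSketch is hypothetical, the reduction polynomial is assumed to be z^m - ω, and p is assumed to be below 2^31.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class ParallelOefSketch {
    static long[] multiply(long[] a, long[] b, long omega, long p) {
        final int m = a.length;
        final long[] c = new long[2 * m - 1];

        // Pass 1: one independent task per output cell. The inputs a and b are
        // only read, and each task writes only its own cell c[x].
        IntStream.range(0, c.length).parallel().forEach(x -> {
            long acc = 0;
            for (int i = Math.max(0, x - m + 1); i <= Math.min(x, m - 1); i++) {
                acc = (acc + (a[i] * b[x - i]) % p) % p;
            }
            c[x] = acc;
        });

        // Pass 2 starts only after pass 1 has finished. Each task reads a high
        // cell c[x + m] and writes a low cell c[x], so tasks never collide.
        IntStream.range(0, m - 1).parallel().forEach(x -> {
            c[x] = (c[x] + (omega * c[x + m]) % p) % p;
        });

        return Arrays.copyOf(c, m); // the first m cells hold the reduced product
    }
}
```

For example, multiply(new long[]{1, 2, 3}, new long[]{4, 5, 6}, 2, 7) computes (1 + 2z + 3z^2)(4 + 5z + 6z^2) modulo z^3 - 2 over GF(7).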
The present invention was compared with an optimal extension field multiplication implemented serially on the Android platform. Testing shows that the parallel method of the present invention has a clear advantage. Measured data for mobile phones of several brands are shown in Table 2 and Table 3:
Table 2: Test data for the different mobile phones
Table 3: Mobile phone configurations
The foregoing is the primary implementation of this intellectual property and does not limit other forms of implementing this new product and/or new method. Those skilled in the art may use this information to modify the foregoing and achieve similar results. However, all such modifications or adaptations based on the new product of the present invention fall within the reserved rights.
The above is only a preferred embodiment of the present invention and does not limit the invention to other forms. Any person skilled in the art may use the technical content disclosed above to change or modify it into an equivalent embodiment. However, any simple modification, equivalent variation, or adaptation made to the above embodiment according to the technical essence of the present invention, without departing from the content of its technical solution, still falls within the scope of protection of the technical solution of the present invention.

Claims (9)

1. A fast parallel computation method for optimal extension field element multiplication, characterized by comprising:
a first step of designing a dedicated optimal extension field element arithmetic Java class for the multiplication of optimal extension field elements;
a second step of designing a multiplication kernel function and a degree-reduction kernel function in RenderScript, both functions being compute kernels invoked in parallel by the Java class object through the RenderScript execution engine, each function being defined to operate on a single memory unit or on the first address of a batch of memory units with homogeneous characteristics.
2. The fast parallel computation method for optimal extension field element multiplication according to claim 1, characterized in that the first step further comprises:
step a: designing a dedicated Java class for optimal extension field element arithmetic and defining three one-dimensional array member variables, two of which store the two polynomials taking part in the operation (the multiplication) and the third of which stores the (multiplication) result;
step b: defining a constructor for the optimal extension field element arithmetic class and initializing each member variable in the constructor;
step c: defining a polynomial multiplication method for the optimal extension field element arithmetic class, in which the memory management interface bind provided by the RenderScript programming framework passes the Allocation class objects corresponding to the two multipliers to the RenderScript execution engine, where they are stored in two array-type variables in the memory space of the RenderScript execution engine.
3. The fast parallel computation method for optimal extension field element multiplication according to claim 1 or 2, characterized in that the second step further comprises:
step a': designing the multiplication kernel function in RenderScript with two parameters, one being an element of the array stored in an Allocation object and the other being the offset of the parallel invocation, assigned automatically by the RenderScript execution engine; each invoked kernel function instance receives a different offset value, and the kernel function uses the offset value to determine the position of its unit in the result array and hence how the value of that unit is computed;
step b': designing the degree-reduction kernel function in RenderScript with two parameters, one being an element of the array stored in an Allocation object and the other being the offset of the parallel invocation, assigned automatically by the RenderScript execution engine;
step c': designing, within the multiplication kernel function, a modulo-p multiplication mechanism that avoids overflow.
4. The fast parallel computation method for optimal extension field element multiplication according to claims 1 to 3, characterized in that step a further comprises defining three Allocation class objects as class member variables for exchanging data with the RenderScript engine, and defining three long-integer class member variables that store the optimal extension field parameters m, ω and p.
5. The fast parallel computation method for optimal extension field element multiplication according to claims 1 to 4, characterized in that step b further comprises creating, according to the value of parameter m, two Allocation class objects of equal size, and creating a third Allocation class object for storing the computation result.
6. The fast parallel computation method for optimal extension field element multiplication according to claims 1 to 5, characterized in that step c further comprises invoking the multiplication kernel function in RenderScript to perform the parallel computation, then invoking the degree-reduction kernel function in RenderScript to complete the reduction modulo the irreducible polynomial, and finally using the memory management interface copyTo provided by the RenderScript programming framework to pass the computation result stored in the third Allocation class object back to a member variable of the Java class.
7. The fast parallel computation method for optimal extension field element multiplication according to claims 1 to 6, characterized in that in step a', the multiplication kernel function uses the offset to determine which term coefficients to read from the two multiplier polynomials, and multiplies and adds them modulo p to obtain the value of the unit.
8. The fast parallel computation method for optimal extension field element multiplication according to claims 1 to 7, characterized in that in step b', during execution of the degree-reduction kernel function, the position of the unit in the result array is determined from the offset value, and it is then determined whether the degree of the corresponding term exceeds m; for each term whose degree is less than m, the element in the array corresponding to the term whose degree after reduction equals its own is located, added to it, and the sum is stored in the unit.
9. The fast parallel computation method for optimal extension field element multiplication according to claims 1 to 8, characterized in that in step c', when computing the product of two numbers s and r modulo p, the multiplier s is expressed in binary form; an accumulator variable t is used and initialized to 0; the binary string of s is traversed bit by bit from right to left, and each bit visited is tested for being 1; if it is 1, the value of r is accumulated into t and t is saved modulo p; the value of r is then set to (r + r) mod p; when the traversal completes, t holds the result of s multiplied by r modulo p; if the machine word length is w bits, then as long as the value of p is not greater than 2^(w-1), the above process produces no overflow, that is, the result of every operation stays within the range representable in one word.
CN201610305021.3A 2016-05-09 2016-05-09 A kind of fast parallel calculation method of optimal extension field element multiplication Expired - Fee Related CN106020949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610305021.3A CN106020949B (en) 2016-05-09 2016-05-09 A kind of fast parallel calculation method of optimal extension field element multiplication

Publications (2)

Publication Number Publication Date
CN106020949A true CN106020949A (en) 2016-10-12
CN106020949B CN106020949B (en) 2019-08-06

Family

ID=57098983

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610305021.3A Expired - Fee Related CN106020949B (en) 2016-05-09 2016-05-09 A kind of fast parallel calculation method of optimal extension field element multiplication

Country Status (1)

Country Link
CN (1) CN106020949B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101271570A (en) * 2008-05-07 2008-09-24 威盛电子股份有限公司 Apparatus and method for large integer multiplication operation
CN103731254A (en) * 2012-10-14 2014-04-16 张仁平 Correcting and applying system of fast algorithm library for number theory (NTL)
US20140244703A1 (en) * 2013-02-26 2014-08-28 Nvidia Corporation System, method, and computer program product for implementing large integer operations on a graphics processing unit

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ANTON KARGL,等: "Fast Arithmetic on ATmega128 for Elliptic Curve Cryptography", 《CRYPTOLOGY EPRINT ARCHIVE: REPORT 2008/442》 *
EUN-HEE GOO, 等: "A Study on Android-Based Real Number Field Elliptic Curve Key Table Generation", 《COMPUTER APPLICATION FOR SECURITY, CONTROL AND SYSTEM ENGINEERING》 *

Also Published As

Publication number Publication date
CN106020949B (en) 2019-08-06

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190806

Termination date: 20210509