WO2015112001A1 - System and method for arbitrary bit permutation using bit-separation and bit-distribution instructions - Google Patents

Info

Publication number
WO2015112001A1
Authority
WO
WIPO (PCT)
Prior art keywords
permutation
bits
permute
bit
log
Prior art date
Application number
PCT/MY2015/000006
Other languages
French (fr)
Inventor
Bin Mohamad Yusof MOHAMAD YUSRI
Binti Mohamad Yassin YASZRINA
Original Assignee
Mimos Berhad
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mimos Berhad
Publication of WO2015112001A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/76 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30018 Bit or string instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30032 Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G06F9/30038 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations using a mask

Definitions

  • the present invention relates to system and methods for performing arbitrary permutations of a sequence of bits in a processor system based on pre-defined permutation instruction sequences.
  • the pre-defined permutation instruction sequences include bit separation instruction sequence for the sequence of bits in one instance and bit distribution instruction sequence for the sequence of bits in another instance.
  • Secure information processing using cryptography is becoming increasingly important.
  • High speed computing, which is used in cryptography and baseband processing, demands high speed level signal processing.
  • Bit permutation operation is a form of bit manipulation to rearrange bits, which is conventionally used to handle general bit-level signal processing.
  • the need for secure information processing has increased with the increasing use of the public internet and wireless communications in e-commerce, e-business and personal use. Typical use of the internet is not secure.
  • Secure information processing typically includes authentication of users and host machines, confidentiality of messages sent over public networks, and assurances that messages, programs and data have not been maliciously changed.
  • Conventional solutions have provided security functions by using different security protocols employing different cryptographic algorithms, such as public key, symmetric key and hash algorithms.
  • bit puncturing is implemented with a number of shifts, bitwise AND and bitwise OR operations.
  • table lookup methods to implement fixed permutations.
  • a table with 2^n entries is used with each entry being n bits.
  • this type of table lookup would use 2^67 bytes, which is clearly infeasible.
  • the table can be broken up into smaller tables, and several table lookup operations could be used. For example, a 64-bit permutation could be implemented by permuting 8 consecutive bits at a time, then combining these 8 intermediate permutations into a final permutation.
  • this invention employs a general register processor (GRP) instruction set to perform permutations, which divides the source bits into two groups depending on configuration bits. To obtain the result of one GRP instruction, the pair of groups is concatenated after the final permutation. This method does not use any bit reversal instructions.
  • GRP general register processor
  • the arbitrary permutation of the source sequence bits is done as monotonically increasing sequences of a pair of bit groups and later merging them together to form a single group of bits rather than sorting the bits based on intrinsic property of the instructions.
  • the general register processor instruction is introduced by employing a data bits source register, a bit mask register, and a target register. Bits in data source register with corresponding 1-bits in the mask register are shifted to one part of register with least significant bit and those with corresponding 0-bits in the mask register are shifted to the part of register with most significant bit position of the target register. The relative positions of source bits within the two groups are retained. However, this method does not incorporate any bit reversal instructions for permutation configuration of bits.
  • the present disclosure presents a system and method of performing arbitrary bit permutation using a sequence of log2(n) different permute processor instructions.
  • the permute processor instructions include bit separation instructions (BSEP), which separate selected bits to one side in order and unselected bits to the other side in reverse order; and bit distribution instructions (BDST), which distribute consecutive bits on one side to selected bit positions in order and distribute the rest of the sequence of bits from the other side to unselected bit positions in reverse source bit order, in a minimal number of steps.
  • BSEP bit separation instructions
  • BDST bit distribution instructions
  • system and method for arbitrary bit permutation of a plurality of n-bits is disclosed.
  • system and method for arbitrary bit permutation using BSEP and BDST instructions disclose procedures for finding specific bit patterns in control words.
  • the control words are generated by a sequence of permute instructions for performing a desired n-bit permutation.
  • the control words provided by the permute instructions become parameters to a sequence of said instructions executing on specialized processor hardware configured to run based on said instructions, which can be used in solving permutation problems in cryptography, multimedia and other applications.
  • the system for arbitrary bit permutation of a plurality of bits in a plurality of instances comprises a permute-enhanced computing processor and a permute control words generator.
  • in the permute-enhanced computing processor of the system for arbitrary bit permutation of a plurality of bits, there is provided a BDST instruction operating on two source registers and one target register. Both the source registers and the target register are n bits wide. According to the present embodiment, the plurality of n-bits is distributed to get the output in the target register.
  • the BDST instruction distributes the sequence of bits from one part of one source register to selected bit positions in the target register in order and sequence of bits from the other part of the source register is moved to unselected bit positions in the target register in reverse order.
  • the positions of the plurality of selected and unselected bit positions are indicated in a control word residing in the other source register.
  • the control word is generated at each stage of the plurality of stages of bit permutation by setting bits in the control word at corresponding positions of the selected bits from an input sequence bits from the source register and clearing bits in the control word at corresponding positions of the unselected bits from the input.
  • a memory subsystem associated with the permute-enhanced processor is provided with a subroutine to perform arbitrary bit permutation comprising a sequence of log2(n) BDST instructions, the subroutine being parameterized by a plurality of log2(n) control words.
  • a BSEP instruction operating on two source registers and one target register. Both the source registers and the target register are n bits wide. According to the present embodiment, the plurality of n-bits is separated to get the output in the target register.
  • the BSEP instruction separates selected bits from one source register to one part of the target register in order and unselected bits from said source register to the other part of the target register in reverse order. The selected and unselected bit positions are indicated in the control word residing in the other source register.
  • the control word is generated at each stage of a plurality of log2(n) stages by setting bits in the control word at corresponding positions of the selected bits from the input and clearing bits in the control word at corresponding positions of the unselected bits from the input.
  • the memory subsystem associated with the permute-enhanced processor is provided with a subroutine to perform arbitrary bit permutation comprising a sequence of log2(n) BSEP instructions, the subroutine being parameterized by log2(n) control words.
  • the control words for the sequence of permute instructions to perform a desired n-bit permutation are provided by the permute control words generator.
  • the permute control words generator generates the plurality of control words which parameterize a sequence of log2(n) instances of the plurality of permute instructions.
  • the plurality of log2(n) control words generated by the permute control words generator in a plurality of log2(n) stages is employed for the execution of the sequence of log2(n) instances of the plurality of permute instructions to obtain the desired permutation of the plurality of n-bits.
  • the sequence of instructions is done in log2(n) stages wherein the first stage takes as input the initial permutation and the last stage generates the ordered sequence, i.e., the identity permutation. Each intervening stage of the instructions takes the intermediate permutation output from the preceding stage as input.
  • each stage generates a control word as a result of selecting one half of the preceding permutation bits in a stage-predetermined manner to one side of the resulting permutation in order and moving the other half of the unselected preceding permutation bits to the other side of the resulting permutation in reverse order.
  • the initial permutation of the permute control words generator is set to the desired permutation and the resulting control words sequence are applied in reverse against the BDST instructions.
  • the initial permutation of the permute control words generator is set to the inverse of the desired permutation and the resulting control words sequence are applied in order against the BSEP instructions.
  • FIG. 1 illustrates a system for performing arbitrary bit permutation comprising permute-enhanced computing processor and permute control words generator;
  • FIG. 2 illustrates permute-enhanced computing processor enhanced with BSEP and BDST Permute unit
  • FIG. 3 illustrates a flowchart for generating control words for a sequence of BDST instructions
  • FIG. 4 illustrates a flowchart for generating different control words for a sequence of BSEP instructions
  • FIG. 5 exemplifies a BSEP operation in accordance with one embodiment of the present invention
  • FIG. 6 exemplifies a BDST operation in accordance with one embodiment of the present invention
  • FIG. 7 illustrates a permutation P comprising a plurality of n-elements of N elements.
  • FIG. 8 illustrates a plurality of steps for a de-permutation of 16 elements using BSEP operation
  • FIG. 9 illustrates a plurality of steps to find an inverse permutation of 16 elements
  • FIG. 10 exemplifies a de-permuting 16-bit inverse permutation steps
  • FIG. 11 exemplifies a permuting 16-bit configuration using BSEP operation permutation steps.
  • FIG. 1 illustrates the system (1103) for performing arbitrary bit permutation comprising a permute-enhanced computing processor (1101) and a permute control words generator (1102).
  • the permute-enhanced computing processor (1101) processes a plurality of permute instructions for performing arbitrary permutation of a plurality of n-bits.
  • the at least one memory subsystem unit (1201) stores a plurality of data and the plurality of permute instructions of a bit permute program.
  • At least one instruction fetch unit (1202) is employed for loading the plurality of permute instructions of the program from the memory subsystem unit (1201) and a decoder unit (1203) decodes the plurality of permute instructions of the program fetched by the at least one instruction fetch unit (1202).
  • the decoder unit (1203) controls a plurality of subsequent units during execution of the plurality of permute instructions of the program.
  • At least one register file (1205) associated with the decoder unit (1203) includes a plurality of n-bit registers for serving as a plurality of operands during execution of the plurality of permute instructions of the bit permutation program.
  • the bit permutation program includes operations for either bit separation or bit distribution instructions.
  • the system (1103) further includes at least one load-store unit (1204) for loading the plurality of operands into the at least one register file (1205) from the at least one memory subsystem unit (1201) and for storing a plurality of resulting operands in the at least one register file (1205) to the memory subsystem unit (1201).
  • the permute-enhanced computing processor (1101) includes a permute unit (1208) for execution of the plurality of permute instructions for performing arbitrary permutation on the plurality of operands, wherein the plurality of operands includes a source operand to permute, a control word operand for identifying a first group of the source operand bits from a second group of the source operand bits and a target operand for storing a permute instruction result.
  • the permute-enhanced computing processor (1101) further includes at least one subroutine in the memory subsystem unit (1201).
  • the at least one subroutine consists of a sequence of log2(n) instances of the plurality of permute instructions parameterized by a plurality of control words generated by the permute control words generator (1102).
  • the permute control words generator (1102) generates the plurality of log2(n) control words in a plurality of log2(n) stages for execution of the sequence of log2(n) instances of the plurality of permute instructions to obtain the desired permutation of the plurality of n-bits.
  • a first stage among the plurality of log2(n) stages selects an initial permutation to be an input and each subsequent stage of the plurality of log2(n) stages generates an intermediate permutation output by selecting a plurality of predetermined bits from the input to a first side of the intermediate permutation output and by moving a sequence of unselected bits from the input to a second opposite side of the intermediate permutation output.
  • Each stage except the first stage of the plurality of log2(n) stages selects the intermediate permutation output from a preceding stage to be the input.
  • the control word is generated at each stage of the plurality of log2(n) stages by setting bits in the control word at corresponding positions of the selected bits from the input and clearing bits in the control word at corresponding positions of the unselected bits from the input.
  • the plurality of permute instructions to the permute-enhanced computing processor (1101) for performing permutation on the plurality of operands include a BDST instruction operating on two source registers and one target register, each register being n bits wide.
  • the BDST instruction distributes sequence of bits from one part of one source register to selected bit positions in the target register in order and sequence of bits from the other part of the said source register to unselected bit positions in the target register in reverse order.
  • the positions of the plurality of selected and unselected bit positions are indicated in a control word residing in the other source register.
  • the control word is generated at each stage of the plurality of stages of bit permutation by setting bits in the control word at corresponding positions of the selected bits from an input sequence bits from the source register and clearing bits in the control word at corresponding positions of the unselected bits from the input.
  • a memory subsystem associated with the permute-enhanced processor is provided with a subroutine to perform arbitrary bit permutation comprising a sequence of log2(n) BDST instructions, the subroutine being parameterized by a plurality of log2(n) control words.
  • the plurality of permute instructions to the permute-enhanced computing processor (1101) for performing permutation on the plurality of operands include a BSEP instruction operating on two source registers and one target register, each register being n bits wide.
  • the plurality of n-bits is separated to get the output in the target register.
  • the BSEP instruction separates selected bits from one source register to one part of the target register in order and unselected bits from said source register to the other part of the target register in reverse order. The selected and unselected bit positions are indicated in the control word residing in the other source register.
  • the control word is generated at each stage of a plurality of log2(n) stages by setting bits in the control word at corresponding positions of the selected bits from the input and clearing bits in the control word at corresponding positions of the unselected bits from the input.
  • the memory subsystem associated with the permute-enhanced processor is provided with a subroutine to perform arbitrary bit permutation comprising a sequence of log2(n) BSEP instructions, the subroutine being parameterized by log2(n) control words.
  • the permute-enhanced computing processor (1101) illustrated in FIG. 2 is enhanced with BSEP and BDST instruction implementation in the BSEP and BDST Permute unit (1208) .
  • the memory subsystem unit (1201) comprises a subroutine comprising a sequence of log2(n) BDST instructions and control word sequences for the BDST subroutine.
  • the memory subsystem unit (1201) comprises a subroutine comprising a sequence of log2(n) BSEP instructions and control word sequences for the BSEP subroutine.
  • the memory subsystem unit (1201) comprises the source data to be permuted.
  • the load-store unit (1204) loads source data to be permuted in addition to the control word sequence for the desired permutation into the register file (1205).
  • a permute instruction in the sequence of permute instructions is loaded and decoded for execution by the instruction fetch unit (1202) and decoder unit (1203), respectively. The source register to be permuted and the register with the corresponding control word are loaded into the BSEP and BDST Permute unit (1208), the decoded permute instruction is executed, and the resulting intermediate bit permutation is stored into the target register in the register file (1205).
  • the target register will become the source register to be permuted for the next permute instruction in the sequence of permute instructions.
  • the permute-enhanced computing processor (1101) includes an ALU (Arithmetic Logic Unit) (1207), common in computing processors, to support the processing of other instructions such as arithmetic and bit-wise operations.
  • ALU Arithmetic Logic Unit
  • FIG. 3 depicts a flowchart for generating control words for a sequence of log2(n) BDST instructions. The initial incoming permutation (104) of the permute control words generator is set to the determined permutation (102) and the resulting control word sequence is applied in reverse against the BDST instructions (118).
  • the stage comprises selecting half of the incoming permutation bits in a stage-predetermined manner to one part of the resulting permutation in order (106), moving unselected bits to the other part of said resulting permutation in reverse order (108), setting corresponding selected bits to 1 and unselected bits to 0 in the control word for the stage (110) and feeding the resulting permutation as the incoming permutation for the next stage (112).
  • the last stage sets bits in its control word corresponding to the first half of source bit positions in the incoming permutation to 1 and sets the other bits in the control word to 0 (116).
  • FIG. 4 depicts a flowchart for generating control words for a sequence of log2(n) BSEP instructions. The initial incoming permutation (122) of the permute control words generator is set to the inverse of the determined permutation (120) and the resulting control word sequence is applied in order against the BSEP instructions (136).
  • the stage comprises selecting half of the incoming permutation bits in a stage-predetermined manner to one part of the resulting permutation in order (124), moving unselected bits to the other part of said resulting permutation in reverse order (126), setting corresponding selected bits to 1 and unselected bits to 0 in the control word for the stage (128) and feeding the resulting permutation as the incoming permutation for the next stage (130).
  • the last stage sets bits in its control word corresponding to the first half of source bit positions in the incoming permutation to 1 and sets the other bits in the control word to 0 (134).
  • the BSEP operation is an 8-bit operation comprising three register operands: a source register (204), a result register (202) and a mask register (206).
  • the bits of the source register (204) that are marked with 1-bits in the control word register (206), i.e. b1, b3, b4 and b7, are distributed in normal order to one side of the result register (202), and the remaining bits, i.e. b0, b2, b5 and b6, which are marked with 0-bits in the control word register (206), are distributed in reverse order to the other side of the result register (202).
  • FIG. 6 exemplifies a BDST operation in accordance with one embodiment of the present invention.
  • the operation is the reverse of the BSEP operation.
  • BDST operation distributes consecutive bits in result register (202) on one side as indicated by the 1's in the mask register (206) in source bit order and the rest of the bits on the other side as indicated by the 0's in the mask register (206) in reverse source bit order.
  • These instructions are useful for some easily identifiable permutations. For example, bit order reversal would require a single instruction.
  • it may be done using two BSEP instructions or two BDST instructions, respectively.
  • a method from embodiments of the present invention is required by which a sequence of log2(n) permute instructions is used to perform an arbitrary permutation; such permutations are used extensively in coding techniques, for example, convolutional coding and convolutional turbo coding.
  • the control words for the sequence of permute instructions to perform a desired n-bit permutation are provided by the permute control words generator (1102), which comprises log2(n) stages wherein the first stage takes as input the initial permutation and the last stage generates the ordered sequence, i.e., the identity permutation. Each intervening stage takes as input the intermediate permutation output from the preceding stage. Furthermore, each stage generates a control word as a result of selecting one half of the preceding permutation bits in a stage-predetermined manner to one side of the resulting permutation in order and moving the other half of the unselected preceding permutation bits to the other side of the resulting permutation in reverse order. Stage-predetermined selection for stage i, except the last stage, of one half of the n bits to move to one part of the resulting permutation in order is given by the following bit positions in base-2 numbers and
  • A method of bit permutation against some arbitrary permutation using a general 2-way split operation in log2(n) steps is described in FIG. 7, where n is the number of elements in the permutation.
  • FIG. 7 illustrates a permutation P comprising N elements.
  • Each step consists of simple intermediate permutation wherein step-specific N/2 elements from the instructions before are grouped into the first half of the intermediate permutation result for the step.
  • N/2 elements from N elements of P are grouped into the first N/2 positions.
  • N/4 elements are taken from the first N/2 positions and grouped into the first N/4 positions.
  • the next N/4 elements are taken from the second N/2 positions and grouped into the second N/4 position. Similar subdivision and operation is performed until the last step.
  • 1 element is taken from each pair from the previous step and placed in the first N/2 positions, in order.
  • step pick half of its elements to make a new ( ⁇ ) -tuple.
  • Each 2-tuple actually comprises an integer and its bitwise complement.
  • notation <...> is used to denote a set of integers comprising those explicitly specified in the angle brackets and their bitwise complements. The above could be written as permutation,
  • FIG. 8 to FIG. 11 illustrate the stages of control word generation in performing an arbitrary 16-bit permutation.
  • the processor picks elements {0, 4, 8, 12, 15, 11, 7, 3} from the original permutation into the first half of the resulting permutation preserving relative order.
  • in stage 1 the processor picks elements {0, 1, 8, 9, 15, 14, 7, 6} from the preceding permutation into the first half of the resulting permutation preserving relative order.
  • stage 2 the processor picks elements
  • FIG. 8 shows how the stages above are used to de-permute permutation (12, 10, 3, 15, 1, 0, 9, 5, 11, 7, 4, 6, 2, 13, 8, 14) using the prescribed method.
  • the control word for each stage can be deduced from FIG. 8.
  • the highlighted elements in the permutation are assigned bit value '1' and the other elements are assigned bit value '0' in the control word.
  • the least significant bit (lsb) is chosen to start from the left. In programming source code, values are typically written with the lsb from the right.
  • 16-bit permutation can be performed using BDST using the steps in reverse direction.
  • control [4]
  • control words are then used in reverse steps using BDST to perform the actual permutation.
  • steps of BSEP can be summarized as a mapping from an arbitrary permutation P to the identity permutation, i.e., ordered sequence, I.
  • the mapping is denoted as:
  • FIG. 9 shows an illustration for finding the inverse permutation.
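
The control word generation and the two instruction sequences described in the list above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not the patented implementation: the function names (`bsep`, `bdst`, `is_selected`, `control_words`) are hypothetical, selected bits are assumed to go to the least significant side, the desired permutation is assumed to be given as a list whose entry p is the source bit index for output bit p, and the stage rule "bit i equals bit i+1" is inferred from the stage 0 and stage 1 element sets listed above rather than taken from the elided formula.

```python
def bsep(x, ctrl, n):
    """Bit separation (assumed layout): bits of x whose ctrl bit is 1 go to
    the low side in order; the other bits go to the high side in reverse."""
    out, lo, hi = 0, 0, n - 1
    for i in range(n):
        bit = (x >> i) & 1
        if (ctrl >> i) & 1:
            out |= bit << lo
            lo += 1
        else:
            out |= bit << hi
            hi -= 1
    return out


def bdst(x, ctrl, n):
    """Bit distribution: the inverse of bsep for the same control word."""
    out, lo, hi = 0, 0, n - 1
    for i in range(n):
        if (ctrl >> i) & 1:
            bit = (x >> lo) & 1   # next bit from the low side, in order
            lo += 1
        else:
            bit = (x >> hi) & 1   # next bit from the high side, reversed
            hi -= 1
        out |= bit << i
    return out


def is_selected(element, stage, stages, n):
    """Inferred stage rule: before the last stage, select elements whose
    bits `stage` and `stage + 1` are equal; in the last stage, select the
    first half of the elements (0 .. n/2 - 1)."""
    if stage < stages - 1:
        return ((element >> stage) & 1) == ((element >> (stage + 1)) & 1)
    return element < n // 2


def control_words(perm):
    """Generate log2(n) control words that de-permute `perm` to identity."""
    n = len(perm)
    stages = n.bit_length() - 1          # log2(n) for n a power of two
    current, words = list(perm), []
    for stage in range(stages):
        ctrl, first, rest = 0, [], []
        for pos, element in enumerate(current):
            if is_selected(element, stage, stages, n):
                ctrl |= 1 << pos          # position 0 is the lsb
                first.append(element)
            else:
                rest.append(element)
        current = first + rest[::-1]      # unselected half in reverse order
        words.append(ctrl)
    assert current == list(range(n)), "stages did not reach identity"
    return words


# Usage with the 16-element permutation discussed for FIG. 8.
perm = [12, 10, 3, 15, 1, 0, 9, 5, 11, 7, 4, 6, 2, 13, 8, 14]
inverse = [0] * len(perm)
for pos, src in enumerate(perm):
    inverse[src] = pos

x = 0xBEEF
via_bdst = x
for word in reversed(control_words(perm)):   # control words in reverse
    via_bdst = bdst(via_bdst, word, 16)

via_bsep = x
for word in control_words(inverse):          # inverse permutation, in order
    via_bsep = bsep(via_bsep, word, 16)
```

Under these assumptions both paths produce the same permuted word, which matches the statement above that BDST uses the desired permutation with the control words applied in reverse, while BSEP uses the inverse permutation with the control words applied in order.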

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Advance Control (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiments herein disclose a system and method for arbitrary bit permutation. A method to perform an arbitrary bit permutation on a programmable processor is disclosed that involves, in one embodiment, log2(n) instances of a bit distribution (BDST) instruction and, in another embodiment, log2(n) instances of a bit separation (BSEP) instruction. In one instance of the embodiment, the permute instruction separates selected bits to one side in order and unselected bits to the other side in reverse order; in another instance of the embodiment, the permute instruction distributes a sequence of bits from one side to selected bit positions in order and a sequence of bits from the other side to unselected bit positions in reverse order.

Description

System and method for arbitrary bit permutation using bit-separation and bit-distribution instructions
Field of Invention
The present invention relates to system and methods for performing arbitrary permutations of a sequence of bits in a processor system based on pre-defined permutation instruction sequences. In particular, the pre-defined permutation instruction sequences include bit separation instruction sequence for the sequence of bits in one instance and bit distribution instruction sequence for the sequence of bits in another instance.
Background of the Invention
Secure information processing using cryptography is becoming increasingly important. High speed computing, which is used in cryptography and baseband processing, demands high speed level signal processing. Bit permutation operation is a form of bit manipulation to rearrange bits, which is conventionally used to handle general bit-level signal processing. The need for secure information processing has increased with the increasing use of the public internet and wireless communications in e-commerce, e-business and personal use. Typical use of the internet is not secure. Secure information processing typically includes authentication of users and host machines, confidentiality of messages sent over public networks, and assurances that messages, programs and data have not been maliciously changed. Conventional solutions have provided security functions by using different security protocols employing different cryptographic algorithms, such as public key, symmetric key and hash algorithms.
Some examples of conventional signal coding that involve bit permutation are the Data Encryption Standard (DES), and bit puncturing and interleaving in forward error correction. Bit puncturing is implemented with a number of shifts, bitwise AND and bitwise OR operations.
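As an illustration of the shift, AND and OR style of implementation mentioned above, the following Python sketch punctures a codeword by dropping the bits whose positions are cleared in a keep mask. The function name `puncture` and the mask convention are assumptions made for this example only.

```python
def puncture(word, keep_mask, n):
    """Drop bits of `word` where keep_mask is 0 and pack the kept bits
    toward the least significant end, preserving their order.

    Returns the punctured word and the number of kept bits.
    """
    out, kept = 0, 0
    for i in range(n):
        if (keep_mask >> i) & 1:              # AND: test the pattern bit
            out |= ((word >> i) & 1) << kept  # shift, AND, shift, OR
            kept += 1
    return out, kept


punctured, length = puncture(0b10110110, 0b11011011, 8)
```

Each kept bit costs several basic operations here, which is the overhead that dedicated permute instructions aim to avoid.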
General bit permutation would require a large number of operations and associated steps. In order to perform a general 128-bit permutation, a sequence of 4 x 128 = 512 operations is generally required. That would take up a lot of resources for applications such as wireless broadband baseband processing.
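The 4 x 128 = 512 figure can be made concrete with a small Python sketch that moves one bit at a time. The breakdown into four operations per bit (shift, mask, shift, merge) is an assumption made for illustration; the text above gives only the total. The name `permute_naive` and the convention that `perm[p]` is the source bit index for output bit p are likewise hypothetical.

```python
def permute_naive(x, perm):
    """Permute the bits of x one at a time, counting basic operations."""
    out, ops = 0, 0
    for p, s in enumerate(perm):
        bit = x >> s      # 1: shift the source bit down to position 0
        bit &= 1          # 2: mask it
        bit <<= p         # 3: shift it to its destination position
        out |= bit        # 4: merge it into the result
        ops += 4
    return out, ops


# Example permutation: rotate a 128-bit word right by one position.
rotate_right = [(p + 1) % 128 for p in range(128)]
result, op_count = permute_naive(1 << 5, rotate_right)
```

For a 128-bit word the counter reaches 512, in line with the estimate in the paragraph above.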
Some other conventional techniques have used table lookup methods to implement fixed permutations. To achieve a fixed permutation of n input bits with one table lookup, a table with $2^n$ entries is used, with each entry being n bits. For a 64-bit permutation, this type of table lookup would use $2^{67}$ bytes ($2^{64}$ entries of 8 bytes each), which is clearly infeasible. Alternatively, the table can be broken up into smaller tables, and several table lookup operations can be used. For example, a 64-bit permutation could be implemented by permuting 8 consecutive bits at a time, then combining these 8 intermediate permutations into a final permutation. This technique, however, requires special hardware to perform fast, full lookups in large tables, and in some cases the performance may deteriorate significantly because of intermittent cache misses while performing large table lookups.
A system and method for performing bit permutation by using bit separation instructions for solving permutation problems in cryptography and multimedia is disclosed in U.S. Patent No. US 7,174,014 B2 issued to Lee. Permutation instructions according to that invention can be employed for arbitrary permutation of bits. However, that invention employs a general register processor (GRP) instruction set to perform permutations, which divides the source bits into two groups depending on configuration bits; to obtain the result of one GRP instruction, the pair of groups is concatenated after the final permutation, and the method does not use any bit reversal instructions. In addition, the arbitrary permutation of the source sequence bits is done by forming monotonically increasing sequences in a pair of bit groups and later merging them together to form a single group of bits, rather than by sorting the bits based on an intrinsic property of the instructions.
In yet another implementation, the general register processor instruction is introduced by employing a data bits source register, a bit mask register, and a target register. Bits in the data source register with corresponding 1-bits in the mask register are shifted to the part of the target register containing the least significant bit, and those with corresponding 0-bits in the mask register are shifted to the part of the target register containing the most significant bit. The relative positions of the source bits within the two groups are retained. However, this method does not incorporate any bit reversal instructions for permutation configuration of bits.
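For comparison with the instructions introduced below, the grouping behaviour just described for the prior-art instruction can be sketched in Python. This is only an illustration of the description in this section, not the implementation of US 7,174,014 B2; the function name `grp` and the explicit register width argument `n` are assumptions made for the example.

```python
def grp(x, mask, n):
    """Stable two-way grouping: bits under 1s in the mask go to the low end,
    bits under 0s go to the high end, each group keeping its source order
    (no reversal of either group)."""
    ones = [pos for pos in range(n) if (mask >> pos) & 1 == 1]
    zeros = [pos for pos in range(n) if (mask >> pos) & 1 == 0]
    result = 0
    for dest, src in enumerate(ones + zeros):
        result |= ((x >> src) & 1) << dest
    return result


# 8-bit illustration: the mask marks source positions 1, 3, 4 and 7.
mask = 0b10011010
print(bin(grp(0b10000000, mask, 8)))  # source bit 7 is the last marked bit
print(bin(grp(0b00000001, mask, 8)))  # source bit 0 is the first unmarked bit
```

Both groups keep their source order here, which is the point of contrast with the reverse-order handling of the unselected group in the instructions of the present disclosure.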
Another prior patent, US 6,952,478 B2 issued to Lee, discloses bit-level permutation instructions for solving permutation problems in cryptography. Both of the methods disclosed in that patent can be used for providing arbitrary permutation of bits by employing a virtual omega-flip interconnection network. The method includes transforming a source sequence of bits into intermediate sequences of bits and repeating the same steps until a desired sequence of bits is obtained using sequences of permutation instructions in an omega-flip network, whereas the proposed invention employs different permute processor instructions for bit permutation.
Another non-blocking, conflict-free routing algorithm for input and output paths in Benes networks is disclosed in the IEEE paper "Matrix-based Nonblocking Routing Algorithm for Benes Networks". In it, arbitrary permutation of the connections between input and output ports is performed to determine conflict-free paths for each and every input and output request. However, this method applies solely to a complete Benes network to determine the routing tags, and the method for finding control tags involves conflict resolution steps which may require revisiting previously selected control tags.
The present disclosure presents a system and method of performing arbitrary bit permutation using a sequence of log2(n) permute processor instructions. The permute processor instructions include bit separation instructions (BSEP), which separate selected bits to one side in order and unselected bits to the other side in reverse order; and bit distribution instructions (BDST), which distribute consecutive bits on one side to selected bit positions in order and distribute the rest of the sequence of bits from the other side to unselected bit positions in reverse source bit order, in a minimal number of steps.
Summary of the Invention
A system and method for arbitrary bit permutation of a plurality of n-bits is disclosed. According to one embodiment, the system and method for arbitrary bit permutation using BSEP and BDST instructions disclose procedures for finding specific bit patterns in control words. The control words are generated for a sequence of permute instructions for performing a desired n-bit permutation. The control words become parameters to a sequence of said instructions executing on specialized processor hardware configured to run said instructions, which can be used in solving permutation problems in cryptography, multimedia and other applications.
The system for arbitrary bit permutation of a plurality of bits in a plurality of instances comprises a permute-enhanced computing processor and a permute control words generator. According to an embodiment of the permute-enhanced computing processor of the system for arbitrary bit permutation of a plurality of bits, there is provided a BDST instruction operating on two source registers and one target register. The source registers and the target register are each n bits wide. According to the present embodiment, the plurality of n-bits is distributed to get the output in the target register. The BDST instruction distributes the sequence of bits from one part of one source register to selected bit positions in the target register in order, and the sequence of bits from the other part of the source register is moved to unselected bit positions in the target register in reverse order. The selected and unselected bit positions are indicated in a control word residing in the other source register. The control word is generated at each stage of the plurality of stages of bit permutation by setting bits in the control word at the positions corresponding to the selected bits of an input bit sequence from the source register and clearing bits in the control word at the positions corresponding to the unselected bits of the input. Furthermore, a memory subsystem associated with the permute-enhanced processor is provided with a subroutine to perform arbitrary bit permutation comprising a sequence of log2(n) BDST instructions, wherein the subroutine is parameterized by a plurality of log2(n) control words.
In another embodiment of the permute-enhanced processor of the system for arbitrary bit permutation of the plurality of bits, there is provided a BSEP instruction operating on two source registers and one target register. The source registers and the target register are each n bits wide. According to the present embodiment, the plurality of n-bits is separated to get the output in the target register. The BSEP instruction separates selected bits from one source register to one part of the target register in order and unselected bits from said source register to the other part of the target register in reverse order. The selected and unselected bit positions are indicated in the control word residing in the other source register. The control word is generated at each stage of a plurality of log2(n) stages by setting bits in the control word at the positions corresponding to the selected bits of the input and clearing bits in the control word at the positions corresponding to the unselected bits of the input. Furthermore, the memory subsystem associated with the permute-enhanced processor is provided with a subroutine to perform arbitrary bit permutation comprising a sequence of log2(n) BSEP instructions, wherein the subroutine is parameterized by log2(n) control words.
The control words for the sequence of permute instructions to perform a desired n-bit permutation are provided by the permute control words generator. The permute control words generator generates the plurality of control words which parameterize a sequence of log2(n) instances of the plurality of permute instructions. The plurality of log2(n) control words generated by the permute control words generator in a plurality of log2(n) stages is employed for the execution of the sequence of log2(n) instances of the plurality of permute instructions to obtain the desired permutation of the plurality of n-bits.
The sequence of instructions is carried out in log2(n) stages, wherein the first stage takes as input the initial permutation and the last stage generates the ordered sequence, i.e., the identity permutation. Each intervening stage takes the intermediate permutation output from the preceding stage as input. Furthermore, each stage generates a control word as a result of selecting one half of the preceding permutation bits in a stage-predetermined manner to one side of the resulting permutation in order and moving the other, unselected half of the preceding permutation bits to the other side of the resulting permutation in reverse order.
For generating control words for a sequence of log2(n) BDST instructions, the initial permutation of the permute control words generator is set to the desired permutation and the resulting control word sequence is applied in reverse to the BDST instructions.
For generating control words for a sequence of log2(n) BSEP instructions, the initial permutation of the permute control words generator is set to the inverse of the desired permutation and the resulting control word sequence is applied in order to the BSEP instructions.
Brief Description of the Drawings
Other objects, features, and advantages of the invention will be apparent from the following description when read with reference to the accompanying drawings. In the drawings, wherein like reference numerals denote corresponding parts throughout the several views:
FIG. 1 illustrates a system for performing arbitrary bit permutation comprising permute-enhanced computing processor and permute control words generator;
FIG. 2 illustrates permute-enhanced computing processor enhanced with BSEP and BDST Permute unit;
FIG. 3 illustrates a flowchart for generating control words for a sequence of BDST instructions;
FIG. 4 illustrates a flowchart for generating control words for a sequence of BSEP instructions;
FIG. 5 exemplifies a BSEP operation in accordance with one embodiment of the present invention;
FIG. 6 exemplifies a BDST operation in accordance with one embodiment of the present invention;
FIG. 7 illustrates a permutation P comprising N elements;
FIG. 8 illustrates a plurality of steps for a de-permutation of 16 elements using the BSEP operation;
FIG. 9 illustrates a plurality of steps to find an inverse permutation of 16 elements;
FIG. 10 exemplifies the steps for de-permuting a 16-bit inverse permutation; and
FIG. 11 exemplifies the steps for permuting a 16-bit configuration using the BSEP operation.
Detailed Description of the Preferred Embodiments
The present invention will now be described in detail with reference to the accompanying drawings. The present invention provides a system and method to perform arbitrary bit permutation using a sequence of log2(n) bit separation (BSEP) instructions in one embodiment and a sequence of log2(n) bit distribution (BDST) instructions in another embodiment, n being the number of bits to permute. FIG. 1 illustrates the system (1103) for performing arbitrary bit permutation comprising a permute-enhanced computing processor (1101) and a permute control words generator (1102). The permute-enhanced computing processor (1101) processes a plurality of permute instructions for performing arbitrary permutation of a plurality of n-bits.
FIG. 2 illustrates the permute-enhanced computing processor (1101) enhanced with a BSEP and BDST permute unit. At least one memory subsystem unit (1201) stores a plurality of data and the plurality of permute instructions of a bit permute program. At least one instruction fetch unit (1202) is employed for loading the plurality of permute instructions of the program from the memory subsystem unit (1201), and a decoder unit (1203) decodes the plurality of permute instructions of the program fetched by the at least one instruction fetch unit (1202). The decoder unit (1203) controls a plurality of subsequent units during execution of the plurality of permute instructions of the program. At least one register file (1205) associated with the decoder unit (1203) includes a plurality of n-bit registers that serve as a plurality of operands during execution of the plurality of permute instructions of the bit permutation program. According to different embodiments of the present invention, the bit permutation program includes operations for either bit separation or bit distribution instructions.
The system (1103) further includes at least one load-store unit (1204) for loading the plurality of operands into the at least one register file (1205) from the at least one memory subsystem unit (1201) and for storing a plurality of resulting operands in the at least one register file (1205) to the memory subsystem unit (1201). The permute-enhanced computing processor (1101) includes a permute unit (1208) for execution of the plurality of permute instructions for performing arbitrary permutation on the plurality of operands, wherein the plurality of operands includes a source operand to permute, a control word operand for distinguishing a first group of the source operand bits from a second group of the source operand bits, and a target operand for storing a permute instruction result. The permute-enhanced computing processor (1101) further includes at least one subroutine in the memory subsystem unit (1201). The at least one subroutine consists of a sequence of log2(n) instances of the plurality of permute instructions parameterized by a plurality of control words generated by the permute control words generator (1102). The permute control words generator (1102) generates the plurality of log2(n) control words in a plurality of log2(n) stages for execution of the sequence of log2(n) instances of the plurality of permute instructions to obtain the desired permutation of the plurality of n-bits. A first stage among the plurality of log2(n) stages takes an initial permutation as its input, and each stage of the plurality of log2(n) stages generates an intermediate permutation output by selecting a plurality of predetermined bits from the input to a first side of the intermediate permutation output and by moving a sequence of unselected bits from the input to a second, opposite side of the intermediate permutation output.
Each stage except the first stage of the plurality of log2(n) stages takes the intermediate permutation output from the preceding stage as its input. The control word is generated at each stage of the plurality of log2(n) stages by setting bits in the control word at the positions corresponding to the selected bits of the input and clearing bits in the control word at the positions corresponding to the unselected bits of the input.
In one embodiment, the plurality of permute instructions to the permute-enhanced computing processor (1101) for performing permutation on the plurality of operands includes a BDST instruction operating on two source registers and one target register, each register being n bits wide. The BDST instruction distributes the sequence of bits from one part of one source register to selected bit positions in the target register in order and the sequence of bits from the other part of said source register to unselected bit positions in the target register in reverse order. The selected and unselected bit positions are indicated in a control word residing in the other source register. The control word is generated at each stage of the plurality of stages of bit permutation by setting bits in the control word at the positions corresponding to the selected bits of an input bit sequence from the source register and clearing bits in the control word at the positions corresponding to the unselected bits of the input. Furthermore, a memory subsystem associated with the permute-enhanced processor is provided with a subroutine to perform arbitrary bit permutation comprising a sequence of log2(n) BDST instructions, wherein the subroutine is parameterized by a plurality of log2(n) control words.
In another embodiment, the plurality of permute instructions to the permute-enhanced computing processor (1101) for performing permutation on the plurality of operands includes a BSEP instruction operating on two source registers and one target register, each register being n bits wide. According to the present embodiment, the plurality of n-bits is separated to get the output in the target register. The BSEP instruction separates selected bits from one source register to one part of the target register in order and unselected bits from said source register to the other part of the target register in reverse order. The selected and unselected bit positions are indicated in the control word residing in the other source register. The control word is generated at each stage of a plurality of log2(n) stages by setting bits in the control word at the positions corresponding to the selected bits of the input and clearing bits in the control word at the positions corresponding to the unselected bits of the input. Furthermore, the memory subsystem associated with the permute-enhanced processor is provided with a subroutine to perform arbitrary bit permutation comprising a sequence of log2(n) BSEP instructions, wherein the subroutine is parameterized by log2(n) control words.
The permute-enhanced computing processor (1101) illustrated in FIG. 2 is enhanced with the BSEP and BDST instruction implementation in the BSEP and BDST Permute unit (1208). In one embodiment, the memory subsystem unit (1201) comprises a subroutine comprising a sequence of log2(n) BDST instructions and the control word sequences for the BDST subroutine. In another embodiment, the memory subsystem unit (1201) comprises a subroutine comprising a sequence of log2(n) BSEP instructions and the control word sequences for the BSEP subroutine. Furthermore, the memory subsystem unit (1201) comprises the source data to be permuted. The load-store unit (1204) loads the source data to be permuted, in addition to the control word sequence for the desired permutation, into the register file (1205). Once a permute instruction in the sequence of permute instructions is loaded and decoded for execution by the instruction fetch unit (1202) and decoder unit (1203), respectively, the source register to be permuted and the register with the corresponding control word are loaded into the BSEP and BDST Permute unit (1208), the decoded permute instruction is executed, and the resulting intermediate bit permutation is stored into the target register in the register file (1205). The target register becomes the source register to be permuted for the next permute instruction in the sequence of permute instructions. The permute-enhanced computing processor (1101) includes an ALU (Arithmetic Logic Unit) (1207), common in computing processors, to support the processing of other instructions such as arithmetic and bit-wise operations.
FIG. 3 depicts a flowchart for generating control words for a sequence of log2(n) BDST instructions. The initial incoming permutation (104) of the permute control words generator is set to the determined permutation (102), and the resulting control word sequence is applied in reverse to the BDST instructions (118). If a stage is not the last stage (114), the stage comprises selecting half of the incoming permutation bits in a stage-predetermined manner to one part of the resulting permutation in order (106), moving the unselected bits to the other part of said resulting permutation in reverse order (108), setting the control word bits for the stage to 1 for selected bits and to 0 for unselected bits (110), and feeding the resulting permutation as the incoming permutation for the next stage (112). For the last stage, the bits in the control word corresponding to the first half of the source bit positions in the incoming permutation are set to 1 and the other bits in the control word are set to 0 (116).
FIG. 4 depicts a flowchart for generating control words for a sequence of log2(n) BSEP instructions. The initial incoming permutation (122) of the permute control words generator is set to the inverse of the determined permutation (120), and the resulting control word sequence is applied in order to the BSEP instructions (136). If a stage is not the last stage (132), the stage comprises selecting half of the incoming permutation bits in a stage-predetermined manner to one part of the resulting permutation in order (124), moving the unselected bits to the other part of said resulting permutation in reverse order (126), setting the control word bits for the stage to 1 for selected bits and to 0 for unselected bits (128), and feeding the resulting permutation as the incoming permutation for the next stage (130). For the last stage, the bits in the control word corresponding to the first half of the source bit positions in the incoming permutation are set to 1 and the other bits in the control word are set to 0 (134).
FIG. 5 exemplifies a BSEP operation in the permute-enhanced computing processor (1101), in accordance with one embodiment of the present invention. In this example, the BSEP operation is an 8-bit operation comprising three register operands: a source register (204), a result register (202) and a mask register (206). The bits of the source register (204) that are marked with 1-bits in the control word register (206), i.e. b1, b3, b4 and b7, are moved in normal order to one side of the result register (202), and the remaining bits of the source register (204), i.e. b0, b2, b5 and b6, which are marked with 0-bits in the control word register (206), are moved in reverse order to the other side of the result register (202).
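The BSEP behaviour described above can be modelled in software as follows. This is a minimal sketch rather than the hardware implementation: the function name `bsep`, the explicit width argument `n`, and the choice of the least significant end as the side that receives the selected bits are assumptions (the last is chosen because it reproduces the control words listed in the 16-bit examples later in this description). The control word in the usage lines, which selects source positions 1, 3, 4 and 7, is likewise only illustrative.

```python
def bsep(x, control, n):
    """Bit separation: bits of x under 1s in `control` are packed at the low
    end in source order; bits under 0s are packed at the high end in reverse
    source order."""
    result = 0
    low = 0         # next free position at the low end
    high = n - 1    # next free position at the high end
    for pos in range(n):
        bit = (x >> pos) & 1
        if (control >> pos) & 1:
            result |= bit << low
            low += 1
        else:
            result |= bit << high
            high -= 1
    return result


# Trace where each single source bit of an 8-bit word ends up.
control = 0b10011010
for src in range(8):
    dest = bsep(1 << src, control, 8).bit_length() - 1
    print(f"b{src} -> position {dest}")
```

With this control word the selected bits b1, b3, b4 and b7 land at positions 0 to 3, and the unselected bits b0, b2, b5 and b6 land at positions 7 down to 4.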
FIG. 6 exemplifies a BDST operation in accordance with one embodiment of the present invention. The operation is the reverse of the BSEP operation. The BDST operation distributes consecutive bits from one side, in source bit order, to the positions of the result register (202) indicated by the 1's in the mask register (206), and the rest of the bits from the other side, in reverse source bit order, to the positions indicated by the 0's in the mask register (206). These instructions are useful for some easily identifiable permutations. For example, bit order reversal requires a single instruction. In addition, separating or distributing bits using the same source bit order for both groups may be done using two BSEP instructions or two BDST instructions, respectively. However, to be truly useful, a method according to embodiments of the present invention is required by which a sequence of log2(n) permute instructions is used to perform an arbitrary permutation; such permutations are used extensively in coding techniques, for example convolutional coding and convolutional turbo coding.
The control words for the sequence of permute instructions to perform a desired n-bit permutation are provided by the permute control words generator (1102), which comprises log2(n) stages wherein the first stage takes as input the initial permutation and the last stage generates the ordered sequence, i.e., the identity permutation. Each intervening stage takes as input the intermediate permutation output from the preceding stage. Furthermore, each stage generates a control word as a result of selecting one half of the preceding permutation bits in a stage-predetermined manner to one side of the resulting permutation in order and moving the other, unselected half of the preceding permutation bits to the other side of the resulting permutation in reverse order. The stage-predetermined selection for stage i, except the last stage, of one half of the n bits to move to one part of the resulting permutation in order is given by the following bit positions in base-2 numbers and
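The relationship between the two instructions, and the two simple uses mentioned above, can be checked with a small software model. This is a sketch under assumptions: the function names, the explicit width argument `n`, and the convention that selected bits occupy the least significant end are choices made for illustration, and the second control word used in `separate_keep_order` is one possible choice that the text does not specify.

```python
def bsep(x, control, n):
    """Selected bits to the low end in order, unselected to the high end reversed."""
    result, low, high = 0, 0, n - 1
    for pos in range(n):
        bit = (x >> pos) & 1
        if (control >> pos) & 1:
            result |= bit << low
            low += 1
        else:
            result |= bit << high
            high -= 1
    return result


def bdst(x, control, n):
    """Inverse of bsep: low-end bits go to selected positions in order,
    high-end bits go to unselected positions in reverse order."""
    result, low, high = 0, 0, n - 1
    for pos in range(n):
        if (control >> pos) & 1:
            result |= ((x >> low) & 1) << pos
            low += 1
        else:
            result |= ((x >> high) & 1) << pos
            high -= 1
    return result


def reverse_bits(x, n):
    # An all-zero control word selects nothing, so the whole word is treated
    # as "unselected" and comes out in reverse order.
    return bsep(x, 0, n)


def separate_keep_order(x, control, n):
    # The second BSEP re-selects the low group (left unchanged) and reverses
    # the high group a second time, restoring its source order.
    k = bin(control & ((1 << n) - 1)).count("1")
    return bsep(bsep(x, control, n), (1 << k) - 1, n)


c = 0b10011010
print(all(bdst(bsep(x, c, 8), c, 8) == x for x in range(256)))
print(bin(reverse_bits(0b00001101, 8)))
```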
Figure imgf000022_0001
The other one half of unselected bit positions is moved to the other part of the resulting permutation in reverse order.
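The figure giving the bit positions is not reproduced in this text. The sketch below uses a rule inferred from the 16-element stage sets listed later in this description, not taken from the figure itself: an element label v is selected at stage i when bits i and i+1 of v are equal, which for the last stage reduces to the top bit of v being 0. The function name `stage_selection` and the formulation through `v ^ (v >> 1)` are assumptions made for illustration.

```python
def stage_selection(stage, n):
    """Element labels moved to the first half at `stage` (0-based) for an
    n-element permutation, n being a power of two."""
    # Bit `stage` of v ^ (v >> 1) is 0 exactly when bits `stage` and
    # `stage + 1` of v agree (a missing upper bit counts as 0).
    return [v for v in range(n) if ((v ^ (v >> 1)) >> stage) & 1 == 0]


for stage in range(4):
    print(stage, stage_selection(stage, 16))
```

Each selected set contains n/2 labels and is closed under bitwise complement for every stage except the last, in line with the description of selecting positions together with their complements.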
A method of bit permutation against some arbitrary permutation using a general 2-way split operation in log2(n) steps is described in FIG. 7, where n is the number of elements in the permutation. By mapping the final sequence to the ordered set of consecutive integers, a systematic method comprising steps to bring any permutation P to the ordered set, or the identity permutation I, is determined based on BSEP instructions. Consequently, by going through the steps in reverse, BDST operation may be performed.
FIG. 7 illustrates a permutation P comprising N elements. Each step consists of a simple intermediate permutation wherein step-specific N/2 elements from the step before are grouped into the first half of the intermediate permutation result for the step. For step 0, N/2 elements from the N elements of P are grouped into the first N/2 positions. For step 1, N/4 elements are taken from the first N/2 positions and grouped into the first N/4 positions. The next N/4 elements are taken from the second N/2 positions and grouped into the second N/4 positions. Similar subdivision and operation are performed until the last step. For the last step, 1 element is taken from each pair from the previous step and placed in the first N/2 positions, in order. There are log2(N) steps. In general, for step i, i = 0,..., log2(N)-1, for every consecutive $(N/2^{i})$-tuple from the previous step, pick half of its elements to make a new $(N/2^{i+1})$-tuple. Then arrange these new $(N/2^{i+1})$-tuples consecutively in the first half of the resulting intermediate permutation for the step. For clarity, the steps are depicted in FIG. 5. The method is applicable to any operation that performs a 2-way split in some manner. It should be clear to a person skilled in the art that the steps could be extended to use an operation that performs any k-way split.
By going through the instructions and the methods, it is clear that a totally arbitrary permutation is changed to a very specific permutation. By labelling the elements of the resulting specific permutation using the sequence of consecutive integers of the ordered set, and moving backward through the steps, one is able to find the specific elements that need to be selected at each step, all the way back up to the original arbitrary permutation.
Now, the above stated computation is performed with steps of the BSEP operation. First, in the last instruction, the elements of the final permutation are labelled using the sequence (0, 1, 2,..., N-1). This sequence is also called the identity permutation, I. Due to the BSEP property wherein selected elements are arranged into the first half in source order and unselected elements are arranged into the second half in reverse source order, the prior permutation would be:
$(\{0, N-1\}, \{1, N-2\}, \{2, N-3\}, \ldots, \{N/2-1, N/2\})$
Each 2-tuple actually comprises an integer and its bitwise complement. For convenience, the notation <...> is used to denote a set of integers comprising those explicitly specified in the angle brackets and their bitwise complements. The above could be written as the permutation,
$(\langle 0 \rangle, \langle 1 \rangle, \ldots, \langle N/2-2 \rangle, \langle N/2-1 \rangle)$
Now consider the second last instruction, which would generate the intermediate permutation above. By choosing the complements to be the values shown at the tail of the permutation, the permutation could also be written as:
$(\langle 0 \rangle, \langle 1 \rangle, \ldots, \langle N/2+1 \rangle, \langle N/2 \rangle)$
The prior permutation that generates $(\langle 0 \rangle, \langle 1 \rangle, \ldots, \langle N/2+1 \rangle, \langle N/2 \rangle)$ over the BSEP operation is found as follows. Each pair from the first half of the resulting permutation comes from two elements in each 4-tuple from the preceding permutation. Due to the BSEP property, the preceding permutation would need to be
$(\langle 0, N/2 \rangle, \langle 1, N/2+1 \rangle, \ldots, \langle N/4-2, 3N/4-2 \rangle, \langle N/4-1, 3N/4-1 \rangle)$
The above permutation is written in matrix form below for clarity, using alternate display values at the tail:
Figure imgf000025_0001
To gain more insight, consider the third last instruction before deducing the intermediate permutation for each step by process of induction. Again due to BSEP property, the preceding permutation would need to be:
Figure imgf000025_0002
The table below summarizes the analysis so far and includes a formulation of the element selection based on the step index.
Figure imgf000026_0001
Table 1
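Table 1 is not reproduced in this text. The tuple structure derived in the analysis above can nevertheless be recomputed by working backward from the identity permutation: because a step keeps selected elements in order and reverses the unselected ones, the j-th tuple entering a step is the union of the j-th and the j-th-from-last tuples leaving it. The sketch below follows that reasoning; the function name and the returned layout are assumptions made for illustration.

```python
def tuples_before_each_step(n):
    """Map each step index to the list of element sets held by consecutive
    tuples of the intermediate permutation that enters that step
    (n must be a power of two)."""
    steps = n.bit_length() - 1             # log2(n)
    after = [{v} for v in range(n)]        # output of the last step: identity
    layout = {}
    for step in range(steps - 1, -1, -1):
        last = len(after) - 1
        # Tuple j feeds output tuple j (selected, in order) and output
        # tuple last - j (unselected, reversed).
        before = [after[j] | after[last - j] for j in range(len(after) // 2)]
        layout[step] = before
        after = before
    return layout


layout = tuples_before_each_step(16)
for step in sorted(layout):
    print(step, [sorted(t) for t in layout[step]])
```

For N = 16 the tuples entering the last step are the pairs {0, 15}, {1, 14}, ..., {7, 8}, and those entering the second last step are the 4-tuples {0, 7, 8, 15}, {1, 6, 9, 14}, {2, 5, 10, 13} and {3, 4, 11, 12}, matching the pair and 4-tuple forms derived above.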
The general operation for each step is described as:
For instruction i, i = 0, ..., log2(n)-2, from the prior intermediate permutation, select the elements identified below, arranged in an $(N/2^{i+2}) \times 2^{i}$ matrix for clarity and transposed from the analysis above, using their original source positions, together with the elements identified by the bitwise complements of the same source positions, and place the elements in the same relative order into the first half of the resulting intermediate permutation for this step.
Figure imgf000026_0002
Place the rest of the unselected elements from the prior permutation into the other half of the resulting intermediate permutation in reverse relative order.
FIG. 8 to FIG. 11 illustrate the stages of control word generation in performing an arbitrary 16-bit permutation.
In stage 0, the processor picks elements {0, 4, 8, 12, 15, 11, 7, 3} from the original permutation into the first half of the resulting permutation preserving relative order.
In stage 1, the processor picks elements {0,1,8,9,15,14,7,6} from the preceding permutation into the first half of the resulting permutation preserving relative order.
In stage 2, the processor picks elements {0,1,2,3,15,14,13,12} from the preceding permutation into the first half of the resulting permutation preserving relative order.
In stage 3, the processor picks elements {0,1,2,3,4,5,6,7} from the preceding permutation into the first half of the resulting permutation preserving relative order. The final ordered sequence obtained is {0, 1, 2,..., 15}.
FIG. 8 shows how the stages above are used to de-permute the permutation (12,10,3,15,1,0,9,5,11,7,4,6,2,13,8,14) using the prescribed method. The control word for each stage can be deduced from FIG. 8. In FIG. 8, for each stage, the highlighted elements in the permutation are assigned bit value '1' and the other elements are assigned bit value '0' in the control word. In FIG. 8, the least significant bit (lsb) is chosen to start from the left. In programming source code, values are typically written with the lsb on the right. Once the control bits are obtained, the actual 16-bit permutation can be performed using BDST by taking the steps in the reverse direction. The pseudo code to perform the 16-bit permutation example is shown below.
// Control words of stages 3, 2, 1 and 0, i.e. in reverse generation order.
control[4] = {
    0b10100110_01011010,
    0b11001100_10010011,
    0b01101001_10101100,
    0b01000111_00101101
};
function Example1Permute16(x) {
    for (i = 0 ... 3) {
        x = BDST(x, control[i]);
    }
    return x;
}
The control words are thus used with BDST in the reverse of the order in which they were generated, so that the word of the last stage is applied first, to perform the actual permutation.
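The control word generation of FIG. 3 and the BDST example above can be tied together in a short software model. It is a sketch under assumptions, not the hardware implementation: the function names are illustrative, the selected side is taken to be the least significant end, and the stage selection rule (select label v at stage i when bits i and i+1 of v agree) is inferred from the stage sets listed above rather than taken from the figures.

```python
N = 16
P = [12, 10, 3, 15, 1, 0, 9, 5, 11, 7, 4, 6, 2, 13, 8, 14]


def bdst(x, control, n):
    """Low-end bits to selected positions in order, high-end bits to
    unselected positions in reverse order."""
    result, low, high = 0, 0, n - 1
    for pos in range(n):
        if (control >> pos) & 1:
            result |= ((x >> low) & 1) << pos
            low += 1
        else:
            result |= ((x >> high) & 1) << pos
            high -= 1
    return result


def sort_control_words(perm):
    """De-permute `perm` to the identity stage by stage, returning one
    control word per stage (bit j refers to position j of the input)."""
    stages = len(perm).bit_length() - 1
    words = []
    for stage in range(stages):
        selected, unselected, word = [], [], 0
        for pos, v in enumerate(perm):
            if ((v ^ (v >> 1)) >> stage) & 1 == 0:
                selected.append(v)
                word |= 1 << pos
            else:
                unselected.append(v)
        perm = selected + unselected[::-1]
        words.append(word)
    return words


# BDST consumes the words in reverse generation order.
BDST_CONTROL = sort_control_words(P)[::-1]


def example1_permute16(x):
    for word in BDST_CONTROL:
        x = bdst(x, word, N)
    return x


print([format(w, "016b") for w in BDST_CONTROL])
```

Under these assumptions the generated words coincide with the four control words listed in the pseudo code, and bit k of the result is bit P[k] of the input.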
The same method can be used to perform a permutation directly. The use of steps of BSEP can be summarized as a mapping from an arbitrary permutation P to the identity permutation, i.e., the ordered sequence, I. The mapping is denoted as:
$P \xrightarrow{\mathrm{BSEP}^k} I$
The steps of BSEP can directly be used to perform any permutation (let it be permutation Q), which is represented in this mapping:
$I \xrightarrow{\mathrm{BSEP}^k} Q$
Given permutation Q, applying its inverse permutation on permutation Q results in the identity permutation I, i.e., $Q^{-1} Q = I$. Multiplying both sides of the mapping above by $Q^{-1}$ gives,
$Q^{-1} \xrightarrow{\mathrm{BSEP}^k} I$
Figure imgf000029_0001
To use steps of BSEP directly to perform permutation Q, one can apply the earlier BSEP instructions on the inverse of Q, i.e., $Q^{-1}$. The control word for every instruction can be deduced and then used with the BSEP instruction.
FIG.9 shows an illustration of finding the inverse permutation. Consider the previous example permutation P = (12,10,3,15,1,0,9,5,11,7,4,6,2,13,8,14). Finding the inverse of a permutation is straightforward. For example, the elements can be tagged with their positions and sorted using the element labels as keys; the resulting permutation of tags is the inverse permutation. The inverse of the example permutation is therefore $P^{-1}$ = (5,4,12,2,10,7,11,9,14,6,1,8,0,13,15,3). The same method is employed, using steps of BSEP, to de-permute permutation $P^{-1}$ as shown in FIG.8. The steps and control configuration derived from FIG.10 are then applied to the identity permutation I, as depicted in FIG.11.
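The tag-and-sort procedure for finding the inverse permutation can be sketched in a few lines of Python. The function name `inverse_permutation` is an illustrative assumption; only the example values come from the text.

```python
def inverse_permutation(perm):
    """Tag each element with its position, then sort by element label."""
    tagged = [(label, pos) for pos, label in enumerate(perm)]
    tagged.sort()  # the element labels act as the sort keys
    return [pos for _, pos in tagged]


# Example permutation from the text and its inverse.
P = [12, 10, 3, 15, 1, 0, 9, 5, 11, 7, 4, 6, 2, 13, 8, 14]
P_INV = inverse_permutation(P)
```

For the example permutation this yields the inverse listed in the text, and applying the function twice returns the original permutation.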
It is clear from FIG.11 that the resulting permutation at the end is the permutation P. The pseudo code for implementing the example permutation using steps of the BSEP operation is shown below.
control[4] = {
    // control words deduced by de-permuting the inverse permutation, first stage first
    0b11011000_01100110,
    0b00011110_01110100,
    0b01011010_01011100,
    0b10101001_01010101
};
function Example2Permute16(x) {
    for (i = 0 ... 3) {
        x = BSEP(x, control[i]);
    }
    return x;
}
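The derivation of the BSEP control words can also be modelled in software. The Python sketch below is a hedged illustration. The names `bsep`, `control_words` and `example2_permute16` are assumptions of this sketch. The per-stage selection rule (pick the elements whose label bits k and k+1 are equal) is inferred from the listed control words and is not stated in this section. The BSEP semantics assume that selected bits go to the lsb part in relative order and unselected bits go to the msb part in reverse order.

```python
def bsep(x, ctrl, n=16):
    """Bit separation (assumed semantics, bit-serial software model)."""
    out, lo, hi = 0, 0, n - 1
    for i in range(n):
        bit = (x >> i) & 1
        if (ctrl >> i) & 1:
            out |= bit << lo      # selected: lsb part, relative order
            lo += 1
        else:
            out |= bit << hi      # unselected: msb part, reverse order
            hi -= 1
    return out


def control_words(perm):
    """De-permute perm stage by stage and record one control word per stage.

    Returns (words, final_sequence); n = len(perm) is assumed to be a
    power of two.
    """
    n = len(perm)
    stages = n.bit_length() - 1          # log2(n)
    seq, words = list(perm), []
    for k in range(stages):
        # Assumed rule: select labels whose bits k and k+1 are equal.
        sel = [(((v >> k) ^ (v >> (k + 1))) & 1) == 0 for v in seq]
        words.append(sum(1 << i for i, s in enumerate(sel) if s))
        picked = [v for v, s in zip(seq, sel) if s]
        rest = [v for v, s in zip(seq, sel) if not s]
        seq = picked + rest[::-1]        # unselected part is reversed
    return words, seq


# Inverse of the example permutation, as given in the text.
P_INV = [5, 4, 12, 2, 10, 7, 11, 9, 14, 6, 1, 8, 0, 13, 15, 3]
CONTROL, FINAL = control_words(P_INV)


def example2_permute16(x):
    for ctrl in CONTROL:
        x = bsep(x, ctrl)
    return x
```

Under these assumptions the generated words match the four control words of the pseudo code, and the de-permuted sequence is the ordered sequence 0 to 15.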
As will be readily apparent to those skilled in the art, the present invention may easily be produced in other specific forms without departing from its essential characteristics. The present embodiments are, therefore, to be considered as merely illustrative and not restrictive, the scope of the invention being indicated by the claims rather than the foregoing description, and all changes which come within that scope are therefore intended to be embraced therein.

Claims

1. A computer implemented system (1103) for performing a desired permutation of a plurality of n-bits in a plurality of instances, the computer implemented system (1103) comprising:
a permute-enhanced computing processor (1101) for processing a plurality of permute instructions for performing arbitrary permutation of the plurality of n-bits;
at least one memory subsystem unit (1201) for storing a plurality of data and the plurality of permute instructions of a program for performing arbitrary permutation of the plurality of n-bits;
at least one instruction fetch unit (1202) for fetching the plurality of permute instructions of the program from the memory subsystem unit (1201);
wherein the permute-enhanced computing processor (1101) further includes at least one subroutine stored in the memory subsystem unit (1201), the at least one subroutine comprising a sequence of log2(n) instances of the plurality of permute instructions parameterized by a plurality of control words generated by a permute control words generator (1102);
at least one decoder unit (1203) for decoding the plurality of permute instructions of the program fetched by the at least one instruction fetch unit (1202), wherein the at least one decoder unit (1203) controls a plurality of subsequent units during execution of the plurality of permute instructions of the program;
at least one register file (1205) comprising a plurality of n-bit registers for performing a plurality of operands during execution of the plurality of permute instructions of the program;
at least one load-store unit (1204) for loading the plurality of operands into the at least one register file (1205) from the at least one memory subsystem unit (1201) and for storing a plurality of resulting operands in the at least one register file (1205) to the at least one memory subsystem unit (1201);
characterised in that the permute control words generator (1102) of the computer implemented system (1103) generates the plurality of log2(n) control words in a plurality of log2(n) stages for execution of the sequence of log2(n) instances of the plurality of permute instructions to obtain the desired permutation of the plurality of n-bits; and the computer implemented system (1103) includes a permute unit (1208) for execution of the plurality of permute instructions for performing arbitrary permutation on the plurality of operands, wherein the plurality of operands includes a source operand to permute, a control word operand for identifying a first group of the source operand bits from a second group of the source operand bits and a target operand for storing a permute instruction result.
2. The computer implemented system (1103) in claim 1 wherein the plurality of permute instructions includes a bit distribution (BDST) instruction operating on the plurality of operands to distribute a sequence of bits in relative bit order from the first group of the source operand to selected bit positions in the target operand identified by the control word operand and a sequence of bits in reverse relative bit order from the second group of the said source operand to unselected bit positions in the target operand identified by the control word operand.
3. The computer implemented system (1103) in claim 1 wherein the plurality of permute instructions includes a bit separation (BSEP) instruction operating on the plurality of operands to separate the sequence of selected bits from the source operand to one part of the target operand in relative bit order and unselected bits from said source operand to the other part of the target operand in reverse relative bit order.
4. The computer implemented system (1103) in claim 1 wherein each stage in the permute control words generator (1102) selects a plurality of predetermined bits to one part of a permutation output in relative bit order and moves unselected bits to the other part of the permutation output in reverse relative order.
5. The computer implemented system (1103) in claim 1 wherein an initial permutation input to the first stage among the plurality of log2(n) stages of the permute control words generator (1102) is the desired permutation and a generated sequence of control words is applied in reverse order on a sequence of log2(n) instances of BDST instruction.
6. The computer implemented system (1103) in claim 1 wherein the initial permutation input to the first stage among the plurality of log2(n) stages of the permute control words generator (1102) is the inverse of the desired permutation and the generated sequence of control words is applied in same order on a sequence of log2(n) instances of BSEP instruction.
7. A method for generating a sequence of log2(n) control words for obtaining a desired permutation on an input of a plurality of n-bits, the method having the plurality of log2(n) stages comprising:
selecting the input to be an initial permutation in a first stage among the plurality of log2(n) stages;
generating an intermediate permutation output in each stage of the plurality of log2(n) stages by selecting a plurality of predetermined bits from the input to a first side of the intermediate permutation output and by moving a plurality of unselected bits from the input to a second side of the intermediate permutation output;
selecting the intermediate permutation output from a preceding stage in each stage except the first stage of the plurality of log2(n) stages, the intermediate permutation output of the preceding stage is selected to be the input; and
generating a control word at each stage of the plurality of log2(n) stages by setting bits in the control word at corresponding positions of the selected bits from the input and clearing bits in the control word at corresponding positions of the unselected bits from the input;
wherein the sequence of log2(n) control words being applied on a sequence of log2(n) instances of the permute instruction in the plurality of log2(n) stages for obtaining the desired permutation of the input having the plurality of n-bits.
8. The method in claim 7 wherein the permute instruction is a bit distribution (BDST) instruction operating on plurality of operands, the BDST instruction distributes sequence of bits from a first side of the source operand to selected bit positions in the target operand in relative bit order and sequence of bits from a second side of said source operand to unselected bit positions in the target operand in reverse relative bit order wherein the control word operand specifies about the selected and unselected bit positions.
9. The method in claim 7 wherein the permute instruction is a bit separation (BSEP) instruction operating on plurality of operands, the BSEP instruction separates selected bits from the source operand to the first side of the target operand in relative bit order and unselected bits from said source operand to the second side of the target operand in reverse relative bit order wherein the control word operand indicates the selected and unselected bit positions.
10. The method in claim 7 wherein generating the intermediate output permutation includes a process of selecting predetermined bits to one part of the permutation output in relative bit order and moving unselected bits to the other part of the permutation output in reverse relative order.
PCT/MY2015/000006 2014-01-22 2015-01-22 System and method for arbitrary bit permutation using bit-separation and bit-distribution instructions WO2015112001A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2014000196 2014-01-22
MYPI2014000196A MY172620A (en) 2014-01-22 2014-01-22 System and method for arbitrary bit pemutation using bit-separation and bit-distribution instructions

Publications (1)

Publication Number Publication Date
WO2015112001A1 true WO2015112001A1 (en) 2015-07-30

Family

ID=52727346

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2015/000006 WO2015112001A1 (en) 2014-01-22 2015-01-22 System and method for arbitrary bit permutation using bit-separation and bit-distribution instructions

Country Status (2)

Country Link
MY (1) MY172620A (en)
WO (1) WO2015112001A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020078011A1 (en) * 2000-05-05 2002-06-20 Lee Ruby B. Method and system for performing permutations with bit permutation instructions
US20020108030A1 (en) * 2000-05-05 2002-08-08 Lee Ruby B. Method and system for performing permutations using permutation instructions based on modified omega and flip stages
WO2013009162A1 (en) * 2011-07-12 2013-01-17 Mimos Berhad Method of providing signals to perform bit permutation

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020078011A1 (en) * 2000-05-05 2002-06-20 Lee Ruby B. Method and system for performing permutations with bit permutation instructions
US20020108030A1 (en) * 2000-05-05 2002-08-08 Lee Ruby B. Method and system for performing permutations using permutation instructions based on modified omega and flip stages
US6952478B2 (en) 2000-05-05 2005-10-04 Teleputers, Llc Method and system for performing permutations using permutation instructions based on modified omega and flip stages
US7174014B2 (en) 2000-05-05 2007-02-06 Teleputers, Llc Method and system for performing permutations with bit permutation instructions
WO2013009162A1 (en) * 2011-07-12 2013-01-17 Mimos Berhad Method of providing signals to perform bit permutation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHIJIE JERRY SHI ET AL: "Bit permutation instructions: Architecture, implementation, and cryptographic properties", 1 January 2004 (2004-01-01), XP055192775, ISBN: 978-0-49-667802-0, Retrieved from the Internet <URL:http://ddod.riss.kr/ddodservice/search/viewDetailThesisInfoForm.jsp?p_no=10593807&p_abstract_yn=Y&p_toc_yn=N&p_fulltext_kind=002&p_search=&p_k2dockey> *
ZHIJIE SHI ET AL: "Bit permutation instructions for accelerating software cryptography", APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES, AND PROCESSORS, 2000. PRO CEEDINGS. IEEE INTERNATIONAL CONFERENCE ON JULY 10-12, 2000, PISCATAWAY, NJ, USA,IEEE, 10 July 2000 (2000-07-10), pages 138 - 148, XP010507744, ISBN: 978-0-7695-0716-3, DOI: 10.1109/ASAP.2000.862385 *

Also Published As

Publication number Publication date
MY172620A (en) 2019-12-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15711893

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15711893

Country of ref document: EP

Kind code of ref document: A1