WO2024007506A1 - Universal lightweight hash processing method, system, and storage medium - Google Patents

Universal lightweight hash processing method, system, and storage medium

Publication number
WO2024007506A1
Authority
WO
WIPO (PCT)
Prior art keywords
hash
algorithm
data
update
register
Prior art date
Application number
PCT/CN2022/131905
Other languages
English (en)
French (fr)
Inventor
郑建良
郑奕
Original Assignee
广西伯汉科技有限公司
Priority date
Filing date
Publication date
Application filed by 广西伯汉科技有限公司
Priority to US18/348,872 (US20240007269A1)
Publication of WO2024007506A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols, the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols, the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0643 Hash functions, e.g. MD5, SHA, HMAC or f9 MAC
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/20 Manipulating the length of blocks of bits, e.g. padding or block truncation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/30 Compression, e.g. Merkle-Damgard construction

Definitions

  • The present invention relates to the field of information technology, and in particular to a universal lightweight hash processing method, system, and storage medium.
  • A hash function compresses data from a domain of arbitrary size into a range of fixed size.
  • The output of the function is called the hash value of the input data, or simply the hash.
  • The phenomenon of several different inputs producing the same hash value is called a collision.
  • Collision resistance does not mean that no collisions exist: for a compressing function, collisions are theoretically unavoidable.
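  • Most current hash algorithms are designed on the Merkle-Damgård structure (hereinafter the MD structure), including the most popular ones: MD5, SHA-1, and SHA-2.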
  • Because the internal state used by an MD-based hash function is the same size as the final hash value, such functions are comparatively vulnerable to attack.
  • The security of the MD structure has therefore become a serious problem.
  • Current approaches to hash security include modifying the MD structure to produce more secure MD variants, and developing new algorithms on non-MD structures, such as wide-pipe structures and sponge-function-based algorithms; wide-pipe structures, however, generally degrade performance.
  • Sponge-based algorithms are regarded as relatively advanced and secure, and the latest SHA-3 standard is also designed on a sponge function.
  • Some studies have pointed out that sponge-based algorithms may be subject to slide attacks.
  • Although SHA-3 is more efficient than SHA-2, it is still not lightweight enough, and it is slower than MD5 and SHA-1.
  • The technical problem solved by the present invention is that existing hash processing algorithms are structurally complex, perform poorly, are vulnerable to attack, and offer low security.
  • The present invention provides a universal lightweight hash processing method, applied to a universal lightweight hash processing system.
  • The method includes: selecting a linear feedback shift register and performing state transitions on it with a register state transition function; initializing the internal state of the hash algorithm using the output of the linear feedback shift register together with an initialization algorithm; inputting the data whose hash value is to be calculated, which is padded data, and updating the initialized internal state with an update algorithm; and post-processing the updated internal state with a determination algorithm to generate the final hash value.
  • Row i of the internal matrix M, S_i = [a_i, b_i, c_i, d_i], is set directly to the four register state values starting from state i×68, and the switching mask m is used as the first state value of the register.
  • The words s, t, and x are also initialized: s and t are set to (a_0 + b_0 + c_0 + d_0), and x is set to 0.
  • The update algorithm uses each 64-bit word of the input data to update the internal state; the input data is padded to a length of one or more complete 64-bit words.
  • The inputs of the update function are a 64-bit state word from the register, a 64-bit word of the padded data, and the current internal state.
  • The output of the update function is the new internal state returned by each call.
  • The data padding methods used for the padded data include free suffix padding and free prefix padding.
  • Free prefix padding consists of the following steps:
  • Step 1: append enough zeros (none are needed if the original data already consists of one or more complete 64-bit words).
  • Step 2: append a 64-bit word w, computed from the data length z (the number of bytes in the data) and the word s.
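  • The determination algorithm mixes S_0 = [a_0, b_0, c_0, d_0] with each remaining set S_i = [a_i, b_i, c_i, d_i] and compresses the result into the hash word h_i (i = 1, 2, ..., n). The hash value is H = h_1 h_2 ... h_n, and >>> denotes the rotate-right bit operation.
  • The update algorithm and the determination algorithm form a two-layer design, in which the upper layer is the update algorithm and the lower layer is the determination algorithm.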
  • A universal lightweight hash processing system includes a processor, a network module, and a memory; the processor and the memory communicate through the network module, and the processor reads a computer program from the memory and runs it.
  • The state transition module performs state transitions on the register, using the register state transition function as the transition tool; the register is a linear feedback shift register.
  • The initialization module initializes the internal state of the hash algorithm from the output of the register in the state transition module, using the initialization algorithm.
  • The update module updates the initialized internal state of the hash algorithm with the update algorithm, using the outputs of the state transition module and the initialization module together with the padded data.
  • The determination module post-processes the updated internal state to generate the final hash value.
  • A computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the universal lightweight hash processing method described above.
  • The universal lightweight hash processing method, system, and storage medium use a balanced maximum-length linear feedback shift register and compute each hash word independently with a single algorithm, addressing the speed and security problems of hash processing. The method is simple to implement, allows the length and starting point of the hash value to be adjusted flexibly, lends itself to fast parallel computation, and needs little storage.
  • When a hash collision occurs, additional hash words can be computed dynamically to eliminate the collision, improving the security of the hash processing method.
  • Figure 1 is a block diagram of a general lightweight hash processing system according to the embodiment of the present application.
  • Figure 2 is a schematic flow chart of a general lightweight hash processing method according to the embodiment of the present application.
  • Figure 3 is a schematic diagram of the overall structure of a general lightweight hash processing method according to the embodiment of the present application.
  • Figure 4 is a schematic diagram of the Galois linear feedback shift register described in the embodiment of the present application.
  • Figure 5 is a schematic structural diagram of a general lightweight hash processing method for calculating the hash value of structured packet data according to the embodiment of the present application
  • Figure 6 is a block flow chart of a universal lightweight hash processing system processor according to the embodiment of the present application.
  • References herein to "one embodiment" or "an embodiment" mean a specific feature, structure, or characteristic that may be included in at least one implementation of the present invention. Appearances of "in one embodiment" in different places in this specification do not all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments.
  • The term "connection" should be understood in a broad sense.
  • It can be a fixed, detachable, or integrated connection; a mechanical or electrical connection; a direct connection or an indirect connection through an intermediary; or an internal connection between two components.
  • The specific meanings of the above terms in the present invention can be understood on a case-by-case basis.
  • FIG. 1 shows a block diagram of a universal lightweight hash processing system provided by an embodiment of the present application.
  • The universal lightweight hash processing system in this embodiment can be a server with data storage, transmission, and processing functions.
  • The universal lightweight hash processing system 100 includes a memory 110, a processor 120, and a network module 130.
  • The memory 110, the processor 120, and the network module 130 are electrically connected, directly or indirectly, to realize data transmission or interaction; for example, these components can be connected to one another through one or more communication buses or signal lines.
  • The memory 110 stores data generated during processing by the processor 120.
  • The processor 120 executes various functional applications and universal lightweight hash processing by running software programs and modules stored in the memory 110.
  • The memory 110 can be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), or electrically erasable programmable read-only memory (EEPROM).
  • The network module 130 establishes a communication connection between the processor 120 and other communication terminal devices over a network, to send and receive network signals and data.
  • These network signals may include wireless signals or wired signals.
  • FIG. 1 is only illustrative; the universal lightweight hash processing system 100 may include more or fewer components than shown in FIG. 1, or have a different configuration.
  • Each component shown in FIG. 1 can be implemented in hardware, software, or a combination thereof.
  • Embodiments of the present application also provide a computer storage medium storing a computer program that implements the above method when run.
  • FIGs. 2 to 4 are schematic diagrams of a universal lightweight hash processing method provided by this embodiment of the application.
  • The steps of the method are applied to the universal lightweight hash processing system 100 and can be implemented by the processor 120.
  • The method includes the following steps S1 to S4:
  • The 64 bits of the register are numbered in sequence from high to low: 64, 63, ..., 1. Each bit takes the binary value 0 or 1.
  • In the figure, the current state of the register is 0110 1110...1100 1001.
  • The bits that affect the next state are called taps.
  • The tap sequence in the figure is [64, 61, 59, 57, ..., 8, 6, 5, 3], which can equivalently be expressed as the characteristic polynomial x^64 + x^61 + x^59 + x^57 + ... + x^8 + x^6 + x^5 + x^3 + 1.
  • The constant "1" in the polynomial does not represent a tap; it refers to the input of one bit. The rightmost bit is the output bit, and the sequence it generates is called the output stream.
  • The tap sequence is represented by the value of the switching mask.
  • The tap sequence [64, 61, 59, 57, ..., 8, 6, 5, 3] corresponds to the binary value 1001 0101...1011 0101, or the hexadecimal value 0x95...b5 (the prefix 0x indicates a hexadecimal value); that is, the switching mask is 0x95...b5.
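The mapping from a tap sequence to a switching mask can be sketched as follows. This is a minimal illustration, assuming that tap k sets bit k of the mask with bit 1 as the lowest bit; the function name `taps_to_mask` is hypothetical, and the 16-bit tap set is a commonly published example used only because the full 64-bit tap list is elided in the text.

```python
def taps_to_mask(taps, width=64):
    """Build a switching mask with one bit set per tap (bit 1 is the lowest bit)."""
    mask = 0
    for tap in taps:
        if not 1 <= tap <= width:
            raise ValueError(f"tap {tap} is outside 1..{width}")
        mask |= 1 << (tap - 1)
    return mask


# The leading taps quoted in the text (64, 61, 59, 57) give the top byte 0x95.
top_byte = taps_to_mask([64, 61, 59, 57]) >> 56
print(hex(top_byte))  # 0x95

# A small 16-bit tap set, used here only for illustration.
print(hex(taps_to_mask([16, 14, 13, 11], width=16)))  # 0xb400
```

Only the leading taps are checked against the text, since the middle of the tap list is not reproduced there.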
  • The next state can be calculated from the current state of the register. Assuming the current state is y and the switching mask is m, the state update is a simple calculation built from the following bit operations:
  • & is the AND bit operation,
  • >> is the logical right shift bit operation,
  • and the remaining operator is the XOR bit operation.
  • The final value of y is the new state.
  • The initial state of the register cannot be zero; otherwise the state never changes and remains zero.
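The exact update formula is not reproduced in the text above, so the following is a sketch of one common way to write a Galois LFSR step with the three named operations (AND, logical right shift, XOR). The names `lfsr_step` and `period` are hypothetical, and the 16-bit mask 0xB400 is a standard textbook example, not the patent's 64-bit mask.

```python
def lfsr_step(y, m):
    """One Galois LFSR step using only AND, logical right shift, and XOR."""
    # -(y & 1) is 0 when the output bit is 0 and -1 (all ones) when it is 1,
    # so the mask m is XORed in only when a 1 is shifted out.
    return (y >> 1) ^ (-(y & 1) & m)


def period(seed, m):
    """Count the steps needed for the register to return to its seed state."""
    y, steps = lfsr_step(seed, m), 1
    while y != seed:
        y = lfsr_step(y, m)
        steps += 1
    return steps


MASK16 = 0xB400  # illustrative 16-bit mask (taps 16, 14, 13, 11), not the patent's mask

print(lfsr_step(0, MASK16))    # 0: an all-zero register never leaves zero
print(period(0xACE1, MASK16))  # 65535, i.e. 2**16 - 1
```

The first print illustrates why the initial state must be non-zero; the second shows the maximum-length behaviour on a register small enough to enumerate.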
  • The hash value obtained is a variable-length hash value consisting of one or more 64-bit unsigned integers. For convenience, a 64-bit unsigned integer is referred to below as a 64-bit word, or simply a word. Except in the initialization stage, the words of the hash value are independent of one another and are calculated independently.
  • The first set S_0 participates in calculating every hash word; the other sets S_1, S_2, ..., S_n are used only to calculate the corresponding hash words h_1, h_2, ..., h_n.
  • The initialization algorithm sets the words s and t to (a_0 + b_0 + c_0 + d_0) and x to 0.
  • The initialization process ensures that all words a_0, b_0, c_0, d_0, a_1, b_1, c_1, d_1, ... are non-zero and distinct, and that any two sets S_i and S_j (i ≠ j) do not overlap: any word in one set is more than 64 register states away from any word in another set. For example, the first word a_1 of set S_1 is 65 register states away from the last word d_0 of set S_0.
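The initialization described above can be sketched as follows, under stated assumptions: the register step is taken to be a Galois LFSR step, "+" is taken to wrap modulo 2^64, and `MASK64` is a placeholder mask because the patent's value 0x95...b5 is elided. The names `lfsr`, `initialize`, `WORD`, and `MASK64` are illustrative.

```python
WORD = (1 << 64) - 1           # assumption: "+" wraps modulo 2**64
MASK64 = 0xD800000000000000    # placeholder mask; the patent's 0x95...b5 is elided


def lfsr(y, m=MASK64):
    """Galois LFSR step (assumed form of the register state transition function)."""
    return (y >> 1) ^ (-(y & 1) & m)


def initialize(n, m=MASK64):
    """Return (M, s, t, x) with M = [S_0, ..., S_n], each S_i = [a_i, b_i, c_i, d_i]."""
    a = m                                  # the mask doubles as the first register state
    b = lfsr(a, m)
    c = lfsr(b, m)
    d = lfsr(c, m)
    M = [[a, b, c, d]]
    s = (a + b + c + d) & WORD
    t, x = s, 0
    for i in range(1, n + 1):
        a = M[i - 1][0]
        for _ in range(68):                # advance 68 states from a_(i-1)
            a = lfsr(a, m)
        b = lfsr(a, m)
        c = lfsr(b, m)
        d = lfsr(c, m)
        M.append([a, b, c, d])
    return M, s, t, x


M, s, t, x = initialize(n=2)
print(len(M), hex(s), x)
```

With this layout, S_i occupies register states 68i to 68i + 3, so a_1 (state 68) is 65 states after d_0 (state 3), matching the spacing described in the text.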
  • S3: Input the data whose hash value is to be calculated, and use the update algorithm to update the initialized internal state.
  • The data whose hash value is to be calculated is the padded data.
  • The update algorithm updates the internal state using each 64-bit word of the input data.
  • The input data is padded to a length of one or more complete 64-bit words.
  • The inputs to the update function are a 64-bit state word from the register, a 64-bit word of the padded data, and the current internal state.
  • The output of the update function is the new internal state returned by each call.
  • The data padding methods used include free suffix padding and free prefix padding.
  • Taking free prefix padding as an example, padding the data takes two steps.
  • Step 1: append enough zeros (none are needed if the original data already consists of one or more complete 64-bit words).
  • Step 2: append a 64-bit word w, computed from the data length z (the number of bytes in the data) and the word s.
  • Step 2 calls the state transition function once, thereby consuming one additional internal state of the linear feedback shift register.
  • The last word of the padded data is therefore derived from z, the number of bytes in the unpadded data, and from an additional word consumed from the register. This in effect creates a gap between the last two register inputs S_{k-1} and S_{k+1}, and this gap is the key to achieving free prefix padding.
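The two padding steps can be sketched as below. The formula for w is not reproduced in the text, so the sketch takes it as a caller-supplied function; `demo_length_word` is a made-up stand-in and is not the patent's formula. Little-endian byte order and all function names are likewise assumptions.

```python
WORD = (1 << 64) - 1


def zero_pad_to_words(data):
    """Step 1: zero-fill to a multiple of 8 bytes and split into 64-bit words."""
    padded = data + b"\x00" * (-len(data) % 8)
    # Little-endian byte order is an assumption; the text does not specify one.
    return [int.from_bytes(padded[i:i + 8], "little") for i in range(0, len(padded), 8)]


def pad(data, s, lfsr, length_word):
    """Steps 1 and 2: return the padded words and the advanced register word s."""
    words = zero_pad_to_words(data)
    s = lfsr(s)                              # step 2 consumes one extra register state
    words.append(length_word(len(data), s) & WORD)
    return words, s


# Hypothetical stand-ins, for demonstration only.
demo_lfsr = lambda y: (y >> 1) ^ (-(y & 1) & 0xD800000000000000)
demo_length_word = lambda z, s: z ^ s        # NOT the patent's formula

words, s = pad(b"hello world", s=0x1234, lfsr=demo_lfsr, length_word=demo_length_word)
print(len(words))  # 3: two data words plus the appended word w
```

Passing the length-word function in as a parameter keeps the sketch honest about what the text does and does not specify.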
  • The values of the four words a_0, b_0, c_0, and d_0 in the first set S_0 remain distinct throughout the update process. The values of the four words a_i, b_i, c_i, and d_i in any other set S_i (i > 0) are not guaranteed to be distinct during the update step, but no two of them, such as a_i and b_i, can be equal twice in a row: once they are equal, the next update XORs them with the two words a_0 and b_0, whose values differ, so the results must differ.
  • The word s can be regarded as a special counter. It does not increase by 1 each time like an ordinary counter; it only guarantees a different value each time. The word effectively acts as a second, automatically generated input alongside the normal data input, uniquely identifying each data word. It enhances the security of the hash algorithm and is the key to giving the algorithm the free prefix padding property.
  • S4: The universal lightweight hash processing system uses the determination algorithm to post-process the updated internal state and generate the final hash value.
  • The determination algorithm mixes S_0 with each remaining set S_i and compresses the result into the hash word h_i; the hash value is H = h_1 h_2 ... h_n, and >>> denotes the rotate-right bit operation.
  • The update algorithm and the determination algorithm form a two-layer design: the upper layer is the update algorithm, and the lower layer is the determination algorithm.
  • The upper layer uses the update algorithm to absorb data, updating the internal state as it goes. This layer uses a simple, efficient, parallelizable algorithm to achieve high speed.
  • The lower layer uses the determination algorithm to mix the internal state thoroughly after the data has been absorbed, then compresses it at a high compression ratio to generate the final hash value, strengthening the collision resistance and one-wayness of the function.
  • The upper and lower layers use different compression functions, which both prevents slide attacks and allows each layer to be optimized separately for its own requirements.
  • The initialized word s is loaded into the linear feedback shift register and is then updated continuously.
  • The update algorithm is called repeatedly.
  • The word s is updated only by the state transition function and is not affected by any input data, so it is not considered part of the internal state. The words t and x are affected by both the input data and the word s, and are part of the internal state.
  • The hash value calculated by this application can start at any byte of any hash word and end at any byte.
  • Let B_ij denote the j-th byte of the i-th hash word h_i.
  • The number of bytes discarded at the beginning (for example 11, or b in hexadecimal) is displayed as a prefix of the hash value.
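How a byte-level window and its prefix might be rendered is sketched below. The big-endian byte order within each word, the omission of the prefix when no bytes are discarded, and the name `format_hash` are all assumptions; the hash words are made-up values, not output of the patent's algorithm.

```python
def format_hash(words, skip_bytes, length_bytes):
    """Render a window of the hash words as hex, prefixed by the discarded-byte count."""
    raw = b"".join(w.to_bytes(8, "big") for w in words)   # byte order is an assumption
    window = raw[skip_bytes:skip_bytes + length_bytes]
    if len(window) < length_bytes:
        raise ValueError("not enough hash words for the requested window")
    text = window.hex()
    # Assumption: the prefix is omitted when nothing is discarded.
    return f"{skip_bytes:x}-{text}" if skip_bytes else text


words = [0x0102030405060708, 0x1112131415161718, 0x2122232425262728]  # made-up values
print(format_hash(words, skip_bytes=0, length_bytes=8))   # 0102030405060708
print(format_hash(words, skip_bytes=11, length_bytes=4))  # b-14151617
```

With 11 bytes discarded, the prefix is `b-`, the same form as the example prefix mentioned in the text.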
  • The embodiment of this application uses a maximum-length linear feedback shift register to generate a long-period, non-repeating pseudo-random number sequence.
  • This sequence serves to initialize constants, automatically generate an additional input besides the data, identify data blocks, and create the gap that enables free prefix padding.
  • The final output in this application is a variable-length hash value, which can resolve hash collisions.
  • The application first fixes a hash value length and then starts calculating hash values. Initially, all hashes start from the first hash word, that is, no bytes are discarded at the beginning of any hash. If a hash collision later occurs, all hashes involved in the collision are adjusted to eliminate it, by appending an extra hash word to each affected hash value. One small problem with this is that different hashes then have different lengths.
  • The solution provided by this embodiment keeps the overall hash length unchanged by discarding one hash word at the beginning of the hash value at the same time. In rare cases the adjustment may need to be repeated: if the first adjustment does not resolve the collision or introduces new ones, adjustment continues until all collisions are eliminated. Adjusting by whole hash words is simple and efficient, but it is not required; adjustments can be made at the byte level if needed.
  • If the additionally calculated words h5 and h5' differ, the two recalculated hash values also differ and the collision between them is eliminated (they may still collide with other hash values, in which case additional hash words must be calculated for all hash values involved). If h5 and h5' are the same, additional hash words continue to be calculated (discarding hash words at the beginning to keep the length unchanged) until no collisions remain.
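The collision adjustment described above amounts to sliding a fixed-length window along a stream of independently computed hash words. The sketch below illustrates only that control flow: `stand_in_word` is a hypothetical substitute built on SHA-256 and truncated to one byte so that collisions actually occur; it is not the patent's hash word. Comparing only the window contents (and not the start offset) is also an assumption.

```python
import hashlib
from collections import defaultdict


def stand_in_word(data, i):
    """Hypothetical stand-in for hash word h_i, cut to 1 byte so collisions are common."""
    return hashlib.sha256(i.to_bytes(4, "big") + data).digest()[:1]


def window(data, first, n_words):
    """Concatenate hash words h_first .. h_(first + n_words - 1)."""
    return b"".join(stand_in_word(data, i) for i in range(first, first + n_words))


def resolve_collisions(items, n_words=2, max_rounds=1000):
    """Slide the window of every colliding item until all hash values are distinct."""
    start = {item: 1 for item in items}            # every hash starts at h_1
    for _ in range(max_rounds):
        groups = defaultdict(list)
        for item in items:
            groups[window(item, start[item], n_words)].append(item)
        colliding = [it for group in groups.values() if len(group) > 1 for it in group]
        if not colliding:
            return {it: (start[it], window(it, start[it], n_words)) for it in items}
        for it in colliding:
            start[it] += 1                         # append one word, drop one at the front
    raise RuntimeError("collisions not resolved within max_rounds")


items = [f"record-{k}".encode() for k in range(200)]
result = resolve_collisions(items)
values = [value for _, value in result.values()]
print(len(set(values)) == len(values))  # True: all hash values are distinct
```

Because each hash word depends only on the data and its index, extending a hash by one word requires computing just that word, which is what makes this adjustment cheap.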
  • An embodiment of the present application also provides a schematic diagram of how the universal lightweight hash processing method calculates the hash value of structured grouped data.
  • Each group of data is padded, all padded groups are then concatenated, and finally the hash value of the entire concatenated data is calculated.
  • This pad-then-concatenate approach differs from the traditional concatenate-then-pad approach.
  • The traditional concatenate-then-pad approach cannot retain the structural information of the data, because no group boundary information is reflected in the final padded data.
  • When free suffix padding or free prefix padding is used, the pad-then-concatenate approach computes a single hash value for the entire data while retaining its structural information.
  • The pad-then-concatenate approach is better suited to structured grouped data, such as a key-value pair, the list of files in a folder, or the list of transactions in a ledger.
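The difference between the two orderings can be seen with a simplified padding rule. In this sketch `pad_group` zero-fills to 8-byte words and appends the plain byte length as one word; that is a stand-in chosen for illustration (the patent's length word also depends on the register word s), and all names are hypothetical.

```python
def pad_group(group):
    """Simplified padding: zero-fill to 8-byte words, then append the byte length."""
    zeros = b"\x00" * (-len(group) % 8)
    return group + zeros + len(group).to_bytes(8, "little")


def pad_then_concat(groups):
    """Pad every group separately, then join the padded groups."""
    return b"".join(pad_group(g) for g in groups)


def concat_then_pad(groups):
    """Traditional order: join the raw groups, then pad once."""
    return pad_group(b"".join(groups))


first = [b"ab", b"c"]    # two groupings of the same bytes "abc"
second = [b"a", b"bc"]

print(concat_then_pad(first) == concat_then_pad(second))  # True: boundaries are lost
print(pad_then_concat(first) == pad_then_concat(second))  # False: boundaries are kept
```

Any hash computed over the concatenate-then-pad output cannot tell the two groupings apart, whereas the pad-then-concatenate output gives the hash different input for each grouping.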
  • The hash algorithm in this embodiment can be used as a keyed hash function: the key and the data are treated as two groups of structured data when calculating the hash value.
  • Each set S_i = [a_i, b_i, c_i, d_i] consists of four 64-bit words.
  • In the general algorithm, this number of cycles should be changed to the general value q + 64, where q is the number of words in each set S_i; in addition, the extra words in each set S_i need to be added to the code in the same way as the original four words.
  • The hexadecimal representation of each hash value must then begin with this number followed by a colon. For example, if the hash value b-40b9442506...3d627a from the previous example was calculated with the number of words in each S_i set to 6, its complete form is 6:b-40b9442506...3d627a.
  • Although the hash algorithm in this application is described only for 64-bit operations, the basic algorithm can be applied almost unchanged to other word sizes; for example, on small resource-constrained devices that do not support 64-bit operations, 32-, 16-, or 8-bit operations can be used instead.
  • The processor 120 includes a state transition module 121, an initialization module 122, an update module 123, and a determination module 124.
  • The state transition module 121 performs state transitions on the register, using the register state transition function as the transition tool; the register is a linear feedback shift register.
  • The initialization module 122 initializes the internal state of the hash algorithm from the output of the register in the state transition module, using the initialization algorithm.
  • The update module 123 updates the initialized internal state of the hash algorithm with the update algorithm, using the outputs of the state transition module and the initialization module together with the padded data.
  • The determination module 124 post-processes the updated internal state and generates the final hash value. For the description of the above modules, refer to the description of the methods shown in Figures 2 to 5; it is not repeated here.
  • Each block in a flowchart or block diagram may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures.
  • Each block in the block diagrams and/or flowcharts, and combinations of blocks, can be implemented by special-purpose hardware-based systems that perform the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
  • The functional modules in each embodiment of the present application can be integrated into one independent part, each module can exist alone, or two or more modules can be integrated into one independent part.
  • If the functions are implemented as software function modules and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • The technical solution of the present application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which can be a personal computer, the universal lightweight hash processing system 100, a network device, etc.) to execute all or part of the steps of the methods of the various embodiments of the present application.
  • The aforementioned storage media include USB flash drives, portable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Power Engineering (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

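The invention discloses a universal lightweight hash processing method, system, and storage medium. A linear feedback shift register is selected, and a register state transition function performs state transitions on it; the output of the linear feedback shift register is combined with an initialization algorithm to initialize the internal state of the hash algorithm; the data whose hash value is to be calculated, which is padded data, is input, and an update algorithm updates the initialized internal state; a determination algorithm post-processes the updated internal state to generate the final hash value. This addresses the speed and security problems of existing hash processing. The method is simple to implement, allows the length and starting point of the hash value to be adjusted flexibly, lends itself to fast parallel computation, and needs little storage; when a hash collision occurs, additional hash words can be computed dynamically to eliminate the collision, improving security.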

Description

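Universal lightweight hash processing method, system, and storage medium
Technical Field
The present invention relates to the field of information technology, and in particular to a universal lightweight hash processing method, system, and storage medium.
Background
A hash function compresses data from a domain of arbitrary size into a range of fixed size. The output of the function is called the hash value of the input data, or simply the hash. The phenomenon of several different inputs producing the same hash value is called a collision. If it is computationally infeasible to find two different inputs in the domain that collide, the hash function is said to be collision-resistant. Note that collision resistance does not mean that no collisions exist: for a compressing function, collisions are theoretically unavoidable.
Most current hash algorithms are designed on the Merkle-Damgård structure (hereinafter the MD structure), including the most popular ones: MD5, SHA-1, and SHA-2. Because the internal state used by an MD-based hash function has the same size as the final hash value, with a fairly simple and direct relationship between the two, such functions are comparatively easy to attack. The security of the MD structure has become a serious problem. Current approaches to hash security include modifying the MD structure to produce more secure MD variants, and developing new algorithms on non-MD structures, such as the wide-pipe structure and sponge-function-based algorithms; the wide-pipe structure, however, generally degrades performance. Sponge-based algorithms are currently regarded as relatively advanced and secure, and the latest SHA-3 standard is also designed on a sponge function, but some studies indicate that sponge-based algorithms may be subject to slide attacks. In addition, although SHA-3 is more efficient than SHA-2, it is still not lightweight enough, and it is slower than MD5 and SHA-1.
Technical Problem
The technical problem solved by the present invention is that existing hash processing algorithms are structurally complex, perform poorly, are vulnerable to attack, and offer low security.
Technical Solution
The purpose of this section is to outline some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. Some simplifications or omissions may be made in this section, in the abstract, and in the title of the application to avoid obscuring their purpose; such simplifications or omissions shall not be used to limit the scope of the invention.
The present invention is proposed in view of the existing problems in the information technology field described above.
Accordingly, the invention addresses the technical problem stated above: existing hash processing algorithms are structurally complex, perform poorly, are vulnerable to attack, and offer low security.
To solve this technical problem, in a first aspect, the invention provides a universal lightweight hash processing method applied to a universal lightweight hash processing system. The method comprises: selecting a linear feedback shift register and performing state transitions on it with a register state transition function; initializing the internal state of the hash algorithm using the output of the linear feedback shift register together with an initialization algorithm; inputting the data whose hash value is to be calculated, which is padded data, and updating the initialized internal state with an update algorithm; and post-processing the updated internal state with a determination algorithm to generate the final hash value.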
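As a preferred scheme of the universal lightweight hash processing method of the invention, the initialization algorithm is expressed as:
Constant: switching mask m
Input: uninitialized internal matrix M = [S_0; S_1; …; S_n] and three 64-bit words s, t, x
Returns: initialized M, s, t, x
    a_0 ← m
    b_0 ← lfsr(a_0)
    c_0 ← lfsr(b_0)
    d_0 ← lfsr(c_0)
    s ← a_0 + b_0 + c_0 + d_0
    t ← s
    x ← 0
    for i from 1 to n:
        a_i ← a_{i-1}
        for j from 1 to 68:
            a_i ← lfsr(a_i)
        b_i ← lfsr(a_i)
        c_i ← lfsr(b_i)
        d_i ← lfsr(c_i)
Row i of the internal matrix M, S_i = [a_i, b_i, c_i, d_i], is set directly to the four register state values starting from state i×68, and the switching mask m is used as the first state value of the register. The words s, t, and x are also initialized: s and t are set to (a_0 + b_0 + c_0 + d_0), and x is set to 0.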
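As a preferred scheme of the universal lightweight hash processing method of the invention, the update algorithm is expressed as:
Input: the data whose hash value is to be calculated, data
Result: updated internal matrix M = [S_0; S_1; …; S_n] and three 64-bit words s, t, x
    function update(data):
        for each 64-bit word w in data:
            [Figure PCTCN2022131905-appb-000002]
            [Figure PCTCN2022131905-appb-000003]
            S_0 ← S_0 ·+ x
            t ← t + w
            [Figure PCTCN2022131905-appb-000004]
            t ← t + (t <<< 31)
            t ← t + (t <<< 15)
            t ← t + (t <<< 7)
            s ← lfsr(s)
            x ← x + s
            [Figure PCTCN2022131905-appb-000005]
            [Figure PCTCN2022131905-appb-000006]
            [Figure PCTCN2022131905-appb-000007]
            for i from 1 to n:
                [Figure PCTCN2022131905-appb-000008]
                S_i ← S_i ·+ x
Here, [Figure PCTCN2022131905-appb-000009] is the XOR bit operation, <<< is the rotate-left bit operation, >> is the logical right shift bit operation, << is the logical left shift bit operation, [Figure PCTCN2022131905-appb-000010] is the "dot XOR" bit operation, and ·+ is the "dot add" operation. The update algorithm uses each 64-bit word of the input data to update the internal state; the input data is padded to a length of one or more complete 64-bit words. The inputs of the update function are a 64-bit state word from the register, a 64-bit word of the padded data, and the current internal state; its output is the new internal state returned by each call.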
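The word-level operators named above can be sketched in Python as follows. The rotations are standard; treating "dot add" and "dot XOR" as element-wise operations that apply one word x to every word of a set S is an assumption, since the text names these operators without defining them. All function names are illustrative.

```python
WORD = (1 << 64) - 1


def rotl64(v, r):
    """Rotate a 64-bit word left by r bits (the <<< operation)."""
    r %= 64
    return ((v << r) | (v >> (64 - r))) & WORD


def rotr64(v, r):
    """Rotate a 64-bit word right by r bits (the >>> operation)."""
    return rotl64(v, 64 - r % 64)


def dot_add(S, x):
    """Assumed meaning of "dot add": add x to every word of S, modulo 2**64."""
    return [(w + x) & WORD for w in S]


def dot_xor(S, x):
    """Assumed meaning of "dot XOR": XOR x into every word of S."""
    return [w ^ x for w in S]


print(hex(rotl64(1, 63)))          # 0x8000000000000000
print(hex(rotr64(1 << 63, 63)))    # 0x1
print(dot_add([WORD, 1], 1))       # [0, 2]: addition wraps at 64 bits
```

Python integers are unbounded, so each result is masked with `WORD` to emulate 64-bit registers.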
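As a preferred scheme of the universal lightweight hash processing method of the invention, the data padding methods used for the padded data include free suffix padding and free prefix padding.
As a preferred scheme of the universal lightweight hash processing method of the invention, free prefix padding is as follows.
Data padding comprises the following steps:
Step 1: append enough zeros (none are needed if the original data already consists of one or more complete 64-bit words).
Step 2: append a 64-bit word w, computed from the data length z (the number of bytes in the data) and the word s, namely:
    s ← lfsr(s)
    [Figure PCTCN2022131905-appb-000011]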
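As a preferred scheme of the general-purpose lightweight hash processing method of the present invention, the finalization algorithm mixes S_0 = [a_0, b_0, c_0, d_0] with each of the other sets S_i = [a_i, b_i, c_i, d_i] and then compresses the mixed result to generate the hash word h_i, where i = 1, 2, …, n. The finalization algorithm is expressed as:
Input: internal matrix M = [S_0; S_1; …; S_n]
Result: hash value H = h_1h_2…h_n
1  for i from 1 to n
2      a ← a_0
3      b ← b_0
4      c ← c_0
5      d ← d_0
6      for j from 1 to 9
7          a ← a + (a <<< 31)
8          Figure PCTCN2022131905-appb-000012
9          b ← b + (b <<< 15)
10         Figure PCTCN2022131905-appb-000013
11         c ← c + (c <<< 7)
12         Figure PCTCN2022131905-appb-000014
13         d ← d + (d <<< 3)
14         Figure PCTCN2022131905-appb-000015
15         a_i ← a_i + (a_i >>> 31)
16         Figure PCTCN2022131905-appb-000016
17         b_i ← b_i + (b_i >>> 15)
18         Figure PCTCN2022131905-appb-000017
19         c_i ← c_i + (c_i >>> 7)
20         Figure PCTCN2022131905-appb-000018
21         d_i ← d_i + (d_i >>> 3)
22         Figure PCTCN2022131905-appb-000019
23     h_i ← a
Here >>> is the bitwise rotate-right operation.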
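As a preferred scheme of the general-purpose lightweight hash processing method of the present invention, the update algorithm and the finalization algorithm form a two-layer design, with the update algorithm as the upper layer and the finalization algorithm as the lower layer.
In a second aspect, a general-purpose lightweight hash processing system is provided, comprising a processor, a network module and a memory, wherein the processor and the memory communicate through the network module, and the processor reads a computer program from the memory and runs it;
wherein a state transition module is configured to perform state transitions on a register using a register state transition function, the register being a linear feedback shift register; an initialization module is configured to initialize the internal state of the hash algorithm using an initialization algorithm, based on the output of the register in the state transition module; an update module is configured to use the outputs of the state transition module and the initialization module, together with the padded data, to update the initialized internal state of the hash algorithm using an update algorithm; and a finalization module is configured to post-process the updated internal state and generate the final hash value.
In a third aspect, a computer-readable storage medium is provided, storing at least one instruction, at least one program, a code set or an instruction set, which is loaded and executed by a processor to implement the above general-purpose lightweight hash processing method.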
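Beneficial Effects
Beneficial effects of the invention: the general-purpose lightweight hash processing method, system and storable medium provided by the embodiments of this application use a balanced maximal-length linear feedback shift register and a single algorithm that computes each hash word independently, which solves the speed and security problems of hash processing. The method is simple to implement, allows the length and starting point of the hash value to be adjusted flexibly, lends itself to fast parallel computation, and needs little storage. When a hash collision occurs, additional hash words can be computed dynamically to eliminate it, improving the security of the hash processing method.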
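Brief Description of the Drawings
To explain the technical solutions of the embodiments more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:
Fig. 1 is a block diagram of a general-purpose lightweight hash processing system according to an embodiment of this application;
Fig. 2 is a flow diagram of a general-purpose lightweight hash processing method according to an embodiment of this application;
Fig. 3 is a diagram of the overall structure of a general-purpose lightweight hash processing method according to an embodiment of this application;
Fig. 4 is a diagram of the Galois linear feedback shift register according to an embodiment of this application;
Fig. 5 is a diagram of the construction used by the general-purpose lightweight hash processing method to compute the hash value of structured, grouped data according to an embodiment of this application;
Fig. 6 is a block flow diagram of the processor of a general-purpose lightweight hash processing system according to an embodiment of this application.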
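Embodiments of the Invention
To make the above objects, features and advantages of the invention clearer, specific embodiments are described in detail below with reference to the drawings. The described embodiments are evidently some, not all, of the embodiments of the invention. All other embodiments obtained from them by a person of ordinary skill in the art without creative effort fall within the scope of protection of the invention.
Many specific details are set out in the following description to aid a full understanding of the invention, but the invention can also be implemented in ways other than those described here, and a person skilled in the art can generalize similarly without departing from its substance. The invention is therefore not limited by the specific embodiments disclosed below.
Further, "one embodiment" or "an embodiment" here refers to a particular feature, structure or characteristic that may be included in at least one implementation of the invention. Occurrences of "in one embodiment" in different places in this specification do not all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments.
The invention is described in detail with reference to schematic drawings. The drawings are examples only and shall not limit the scope of protection of the invention. In addition, actual production should include the three-dimensional dimensions of length, width and depth.
In the description of the invention, similar reference numerals and letters denote similar items in the drawings below; once an item is defined in one drawing, it need not be defined or explained again in later drawings. The terms "first", "second" and "third" are used for description only and shall not be understood as indicating or implying relative importance.
Unless otherwise expressly specified and limited, the terms "mounted", "coupled" and "connected" should be understood broadly: a connection may be fixed, detachable or integral; it may be mechanical, electrical or direct, or indirect through an intermediate medium, or an internal connection between two elements. A person of ordinary skill in the art can understand the specific meaning of these terms in the invention according to the circumstances.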
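Fig. 1 shows a block diagram of a general-purpose lightweight hash processing system provided by an embodiment of this application. The system may be a server with data storage, transmission and processing functions. As shown in Fig. 1, the general-purpose lightweight hash processing system 100 includes a memory 110, a processor 120 and a network module 130.
The memory 110, the processor 120 and the network module 130 are electrically connected to one another, directly or indirectly, to transmit or exchange data. For example, these components may be electrically connected through one or more communication buses or signal lines. The memory 110 stores data produced while the processor 120 is processing, and the processor 120 executes various functional applications and the general-purpose lightweight hash processing by running the software programs and modules stored in the memory 110.
The memory 110 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like. The memory 110 stores programs and the data processed by the processor 120; the processor 120 executes a program after receiving an execution instruction.
The network module 130 establishes a communication connection between the processor 120 and other communication terminal devices over a network, to send and receive network signals and data. The network signals may include wireless or wired signals.
It will be understood that the structure shown in Fig. 1 is illustrative only; the general-purpose lightweight hash processing system 100 may include more or fewer components than shown in Fig. 1, or have a different configuration. The components shown in Fig. 1 may be implemented in hardware, software or a combination of the two.
An embodiment of this application also provides a computer-storable medium storing a computer program that implements the above method when run.
Figs. 2 to 4 are schematic diagrams of the general-purpose lightweight hash processing method provided by this embodiment. The method steps defined by the flow are applied to the general-purpose lightweight hash processing system 100 and can be implemented by the processor 120. As Figs. 2 and 3 show, the method includes the following steps S1 to S4: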
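S1: Select a linear feedback shift register and use a register state transition function to perform state transitions on it.
Referring to Fig. 4, the 64 bits of the register are numbered in order from high to low as 64, 63, …, 1, and each bit takes the binary value 0 or 1. In Fig. 4 the current register state is 0110 1110…1100 1001. The bit positions that affect the next state are called taps. The tap sequence in the figure is [64,61,59,57,…,8,6,5,3], which can equivalently be represented by the characteristic polynomial x^64+x^61+x^59+x^57+…+x^8+x^6+x^5+x^3+1. The constant "1" in the polynomial does not represent a tap; it refers to the input to a bit. The rightmost bit is the output bit, and the sequence it generates is called the output stream. In arithmetic, the tap sequence is represented by the value of a toggle mask. For example, the tap sequence [64,61,59,57,…,8,6,5,3] corresponds to the binary value 1001 0101…1011 0101, or the hexadecimal value 0x95…b5 (the prefix 0x indicates that a hexadecimal value follows); that is, the toggle mask is 0x95…b5. Given the toggle mask, the next state can be computed from the current state of the register. For example, assuming the current state is y and the toggle mask is m, the state update can be carried out by the following simple computation: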
i ← y & 1
y ← y >> 1
if i ≠ 0
    y ← y ⊕ m
where & is the bitwise AND operation, >> is the logical right shift operation, and ⊕ is the bitwise XOR operation.
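Optionally, in a specific implementation, the linear feedback shift register may be a 64-bit balanced maximal-length linear feedback shift register whose toggle mask is m=0x95ac9329ac4bc9b5. Taking the mask as a constant, the current register state y is passed to a function that returns the new register state y:
Constant: toggle mask m = 0x95ac9329ac4bc9b5
Input: current register state y
Returns: new register state y
1  function lfsr(y):
2      i ← y & 1
3      y ← y >> 1
4      if i ≠ 0
5          y ← y ⊕ m
6      return y
After the computation, the final value of y is the new state. The initial state of the register must not be zero; otherwise the state never changes and remains zero forever.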
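The state transition function above can be sketched in Python as follows. This is a minimal sketch rather than the applicants' reference code: the step executed when the low bit is set appears only as a formula image in the source, so the XOR with the toggle mask (the usual Galois LFSR form, consistent with the operators the text lists) is an assumption, and the names `MASK64`, `TOGGLE_MASK` and `lfsr` are illustrative.

```python
MASK64 = (1 << 64) - 1             # keep values inside 64 bits
TOGGLE_MASK = 0x95AC9329AC4BC9B5   # toggle mask m given in the text


def lfsr(y: int) -> int:
    """Advance a 64-bit Galois LFSR by one state."""
    low_bit = y & 1                # i <- y & 1 (the output bit)
    y = (y >> 1) & MASK64          # y <- y >> 1 (logical right shift)
    if low_bit:
        y ^= TOGGLE_MASK           # assumed step: y <- y XOR m
    return y


# Walk the first few states, starting from the toggle mask itself.
state = TOGGLE_MASK
first_states = []
for _ in range(4):
    first_states.append(state)
    state = lfsr(state)
```

Starting from zero leaves the register stuck at zero, which is why the text requires a non-zero initial state.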
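Obtaining a hash value takes three steps: initialization, update and finalization. The resulting hash value has variable length and consists of one or more 64-bit unsigned integers. For convenience, in what follows a 64-bit unsigned integer is called a 64-bit word, or simply a word. Except in the initialization stage, the words of a hash value are mutually unrelated and computed independently. Let S_i = [a_i, b_i, c_i, d_i] (i = 0, 1, 2, ..., n) denote a set containing the four words a_i, b_i, c_i and d_i. The internal state of the hash algorithm can then be represented as an (n+1)×4 internal matrix M, where M is:
M = [ a_0  b_0  c_0  d_0
      a_1  b_1  c_1  d_1
      …
      a_n  b_n  c_n  d_n ]
The first set S_0 takes part in computing every hash word; the other sets S_1, S_2, …, S_n are each used only to compute the corresponding hash word h_1, h_2, …, h_n.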
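S2: Use the output of the linear feedback shift register, together with the initialization algorithm, to initialize the internal state of the hash algorithm.
Optionally, in this embodiment, the initialization algorithm is expressed as: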
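Constant: toggle mask m = 0x95ac9329ac4bc9b5
Input: uninitialized internal matrix M = [S_0; S_1; …; S_n] and three 64-bit words s, t, x
Returns: initialized M, s, t, x
1  a_0 ← m
2  b_0 ← lfsr(a_0)
3  c_0 ← lfsr(b_0)
4  d_0 ← lfsr(c_0)
5  s ← a_0 + b_0 + c_0 + d_0
6  t ← s
7  x ← 0
8  for i from 1 to n
9      a_i ← a_{i-1}
10     for j from 1 to 68
11         a_i ← lfsr(a_i)
12     b_i ← lfsr(a_i)
13     c_i ← lfsr(b_i)
14     d_i ← lfsr(c_i)
Row i of the internal matrix M, S_i = [a_i, b_i, c_i, d_i], is set directly to the four register state values starting from the (i×68)-th state, with the toggle mask m serving as the first state value of the register. The words s, t and x are also initialized: s and t are set to (a_0 + b_0 + c_0 + d_0), and x is set to 0.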
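The initialization process ensures that all the words a_0, b_0, c_0, d_0, a_1, b_1, c_1, d_1, ... are non-zero and distinct, and that no two sets S_i and S_j (i≠j) overlap; that is, any word in one set is more than 64 register states away from any word in another set. For example, the first word a_1 of set S_1 is 65 register states away from the last word d_0 of set S_0.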
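The initialization steps can be sketched as below, under two assumptions that the source does not state explicitly: the LFSR step XORs in the toggle mask when the low bit is set (that formula is an image in the source), and the additions wrap modulo 2^64 because the operands are 64-bit words. The function name `initialize` and the list-of-lists layout for M are illustrative choices.

```python
MASK64 = (1 << 64) - 1
TOGGLE_MASK = 0x95AC9329AC4BC9B5   # toggle mask m from the text


def lfsr(y):
    """One Galois LFSR step (XOR with the mask is an assumed reconstruction)."""
    low_bit = y & 1
    y >>= 1
    return y ^ TOGGLE_MASK if low_bit else y


def initialize(n):
    """Return (M, s, t, x) where M has n + 1 rows [a_i, b_i, c_i, d_i]."""
    a = TOGGLE_MASK                 # a_0 <- m
    b = lfsr(a)
    c = lfsr(b)
    d = lfsr(c)
    matrix = [[a, b, c, d]]
    s = (a + b + c + d) & MASK64    # assumed to wrap at 64 bits
    t = s
    x = 0
    for _ in range(n):
        a = matrix[-1][0]           # a_i <- a_{i-1}
        for _ in range(68):         # skip ahead 68 register states
            a = lfsr(a)
        b = lfsr(a)
        c = lfsr(b)
        d = lfsr(c)
        matrix.append([a, b, c, d])
    return matrix, s, t, x
```

Because each row starts 68 states after the previous one and a row spans four consecutive states, the first word of a row is 68 - 3 = 65 states after the last word of the row before it, which matches the spacing described above.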
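S3: Input the data whose hash value is to be computed, and use the update algorithm to update the initialized internal state; the data whose hash value is to be computed is padded data.
Optionally, in this embodiment, the update algorithm is expressed as: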
Input: the data whose hash value is to be computed, data
Result: updated internal matrix M = [S_0; S_1; …; S_n] and three 64-bit words s, t, x
1  function update(data):
2      for each 64-bit word w in data do
3          Figure PCTCN2022131905-appb-000024
4          Figure PCTCN2022131905-appb-000025
5          S_0 ← S_0 ·+ x
6          t ← t + w
7          Figure PCTCN2022131905-appb-000026
8          t ← t + (t <<< 31)
9          t ← t + (t <<< 15)
10         t ← t + (t <<< 7)
11         s ← lfsr(s)
12         x ← x + s
13         Figure PCTCN2022131905-appb-000027
14         Figure PCTCN2022131905-appb-000028
15         Figure PCTCN2022131905-appb-000029
16         for i from 1 to n
17             Figure PCTCN2022131905-appb-000030
18             S_i ← S_i ·+ x
Here ⊕ is the bitwise XOR operation, <<< is bitwise rotate-left, >> is logical right shift, << is logical left shift, ·⊕ is the "dot-XOR" bitwise operation, and ·+ is the "dot-add" operation. The two dot operators "·⊕" and "·+" operate element by element on the components of two compound operands (or of one compound operand and one ordinary operand). For example, if S_1 = [a_1, b_1, c_1, d_1] and S_0 = [a_0, b_0, c_0, d_0], then
Figure PCTCN2022131905-appb-000035
Figure PCTCN2022131905-appb-000036
If x is an ordinary operand, then S_1 ·+ x = [a_1+x, b_1+x, c_1+x, d_1+x]. In this application, compound variables are written in upper case and ordinary variables in lower case.
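The element-wise behaviour of the two dot operators can be illustrated with a short sketch. The function names `dot_xor` and `dot_add` are illustrative, a set is modelled as a Python list of words, and addition is assumed to wrap at 64 bits because the operands are 64-bit words.

```python
MASK64 = (1 << 64) - 1


def _broadcast(left, right):
    """Repeat an ordinary (scalar) operand so it lines up with a compound one."""
    if isinstance(right, int):
        return [right] * len(left)
    return right


def dot_xor(left, right):
    """Element-wise XOR of a set with another set or with a single word."""
    return [(p ^ q) & MASK64 for p, q in zip(left, _broadcast(left, right))]


def dot_add(left, right):
    """Element-wise addition, assumed modulo 2**64."""
    return [(p + q) & MASK64 for p, q in zip(left, _broadcast(left, right))]


# S_1 .+ x with an ordinary operand x adds x to every word of the set.
shifted = dot_add([1, 2, 3, MASK64], 1)
```

The last element wraps to zero, which is the 64-bit overflow behaviour a hardware or SIMD implementation would show.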
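The update algorithm uses each 64-bit word of the input data to update the internal state; the input data is padded to a length containing one or more complete 64-bit words. The inputs of the update function are one 64-bit state word from the register, one 64-bit word of the padded data, and the current internal state; its output is the new internal state returned by each call.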
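Further, the data padding methods used include suffix-free padding and prefix-free padding. Taking prefix-free padding as an example, padding the data takes two steps:
Step 1: simply append enough zeros (no zeros are needed if the original data already consists of one or more complete 64-bit words);
Step 2: append one 64-bit word w, computed from the data length z (the number of bytes in the data) and the word s, namely:
s ← lfsr(s)
Figure PCTCN2022131905-appb-000037
The second step calls the state transition function once and therefore consumes one extra internal state of the linear register. As Fig. 3 shows, the last word of the padded data is set to
Figure PCTCN2022131905-appb-000038
where z is the number of bytes in the unpadded data. This consumes one extra word from the register and in effect creates a gap between the last two register inputs S_{k-1} and S_{k+1}; this gap is the key to achieving prefix-free padding.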
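The first padding step can be sketched as follows. The sketch covers step 1 only: the formula that combines the byte length z with the register word s in step 2 is an image in the source and is deliberately not reconstructed here. The little-endian packing of bytes into 64-bit words and the name `zero_pad_to_words` are assumptions made for illustration.

```python
def zero_pad_to_words(data):
    """Zero-fill `data` to a multiple of 8 bytes and split it into 64-bit words.

    Byte order (little-endian) is an assumption; the source does not state it.
    """
    remainder = len(data) % 8
    if remainder:                        # already aligned data gets no zeros
        data = data + b"\x00" * (8 - remainder)
    return [int.from_bytes(data[i:i + 8], "little")
            for i in range(0, len(data), 8)]


words = zero_pad_to_words(b"abc")        # 3 bytes become one padded word
```

A complete implementation would then advance s with lfsr and append the length-dependent word w described in step 2.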
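The values of the four words a_0, b_0, c_0 and d_0 in the first set S_0 remain distinct throughout the update process. The four words a_i, b_i, c_i and d_i of any other set S_i (i>0) are not guaranteed to be distinct during the update step, but no two of them, for example a_i and b_i, can be equal twice in a row: once they are equal, in the next update they are XORed with the unequal words a_0 and b_0 respectively, so the results must differ.
Among the data inputs of the update algorithm, the internal state word s can be regarded as a special counter. Unlike an ordinary counter it does not increase by 1 each time; it only guarantees a different value each time. This word in effect serves as a second (automatically generated) input alongside the normal data input and uniquely identifies each normal data word. It strengthens the security of the hash algorithm and is the key to giving the algorithm its prefix-free padding property.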
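S4: The general-purpose lightweight hash processing system uses the finalization algorithm to post-process the updated internal state and generate the final hash value.
Optionally, the finalization algorithm used in this embodiment mixes S_0 = [a_0, b_0, c_0, d_0] with each of the other sets S_i = [a_i, b_i, c_i, d_i] and then compresses the mixed result to generate the hash word h_i, where i = 1, 2, …, n. The finalization algorithm is expressed as: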
Input: internal matrix M = [S_0; S_1; …; S_n]
Result: hash value H = h_1h_2…h_n
1  for i from 1 to n
2      a ← a_0
3      b ← b_0
4      c ← c_0
5      d ← d_0
6      for j from 1 to 9
7          a ← a + (a <<< 31)
8          Figure PCTCN2022131905-appb-000039
9          b ← b + (b <<< 15)
10         Figure PCTCN2022131905-appb-000040
11         c ← c + (c <<< 7)
12         Figure PCTCN2022131905-appb-000041
13         d ← d + (d <<< 3)
14         Figure PCTCN2022131905-appb-000042
15         a_i ← a_i + (a_i >>> 31)
16         Figure PCTCN2022131905-appb-000043
17         b_i ← b_i + (b_i >>> 15)
18         Figure PCTCN2022131905-appb-000044
19         c_i ← c_i + (c_i >>> 7)
20         Figure PCTCN2022131905-appb-000045
21         d_i ← d_i + (d_i >>> 3)
22         Figure PCTCN2022131905-appb-000046
23     h_i ← a
Here >>> is the bitwise rotate-right operation.
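The rotate-and-add steps that are legible in the listing can be sketched with two small helpers. Only the rotations and the "add the rotated copy" pattern are shown; the interleaved steps that appear as formula images in the source are not reconstructed. The names `rotl`, `rotr` and `add_rotl` are illustrative, and addition is assumed to wrap at 64 bits.

```python
MASK64 = (1 << 64) - 1


def rotl(value, count):
    """Rotate a 64-bit word left by `count` bits (the <<< operator)."""
    count %= 64
    return ((value << count) | (value >> (64 - count))) & MASK64


def rotr(value, count):
    """Rotate a 64-bit word right by `count` bits (the >>> operator)."""
    count %= 64
    return ((value >> count) | (value << (64 - count))) & MASK64


def add_rotl(value, count):
    """One step of the form a <- a + (a <<< count), assumed modulo 2**64."""
    return (value + rotl(value, count)) & MASK64


mixed = add_rotl(1, 31)   # 1 + (1 <<< 31)
```

Rotations, unlike shifts, lose no bits, so every input bit keeps influencing the word as the rounds repeat.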
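The final hash value H is the concatenation of the hash words h_i; that is, a hash value of n words is H = h_1h_2…h_n, a variable-length hash value.
Further, the update algorithm and the finalization algorithm form a two-layer design, with the update algorithm as the upper layer and the finalization algorithm as the lower layer. The upper layer uses the update algorithm to absorb the data and update the internal state as it goes; this layer uses a simple, efficient and parallelizable algorithm to run at high speed. The lower layer uses the finalization algorithm to mix the internal state thoroughly once all data has been absorbed, and then compresses it at a high ratio to produce the final hash value, strengthening the collision resistance and one-wayness of the function. Using different compression functions in the two layers not only prevents slide attacks but also lets each layer be optimized separately for its own requirements.
In this embodiment, the initialized word s is loaded into the linear feedback shift register and is then updated continually as the update algorithm is called repeatedly. The word s is updated only through the state transition function and is not affected by any input data; it is not regarded as part of the internal state. The words t and x are affected by both the input data and the word s, and they are part of the internal state.
The two-layer design balances speed and security.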
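A hash value computed by this application may start at any byte of any hash word and end at any byte. Let B_ij denote the j-th byte of the i-th hash word h_i; then H = h_1h_2...h_n = B_11B_12...B_18B_21B_22...B_28...B_n1B_n2...B_n8. For example, an application may choose to drop 11 bytes at the beginning of the hash value and 5 bytes at the end, giving a new hash value H = B_24B_25B_26B_27B_28...B_n1B_n2B_n3. When the hash value is written as a hexadecimal string, the number of bytes dropped at the beginning (for example 11, or b in hexadecimal) is shown as a prefix of the hash value.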
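The byte-level trimming described above can be sketched as follows. The big-endian serialization of each hash word, the hyphen between the prefix and the digits (modelled on the sample string "b-40b9442506...3d627a" that appears later in the text), and the name `trim_hash` are assumptions for illustration.

```python
def trim_hash(hash_words, drop_front, drop_back):
    """Drop bytes from both ends of H and return '<dropped-in-hex>-<hex digits>'."""
    # Assumed serialization: each 64-bit hash word as 8 big-endian bytes.
    raw = b"".join(word.to_bytes(8, "big") for word in hash_words)
    kept = raw[drop_front:len(raw) - drop_back]
    return f"{drop_front:x}-{kept.hex()}"


# Three hash words whose bytes spell out their own (word, byte) position.
words = [0x1112131415161718, 0x2122232425262728, 0x3132333435363738]
trimmed = trim_hash(words, drop_front=11, drop_back=5)
```

With n = 3, dropping 11 leading bytes starts the result at B_24 and dropping 5 trailing bytes ends it at B_33, mirroring the example in the text.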
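The embodiments of this application use a maximal-length linear feedback shift register to generate a long-period, non-repeating pseudorandom sequence. This sequence serves to initialize constants, to automatically generate an extra input besides the data, to identify data blocks, and to create the gap that enables prefix-free padding.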
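In addition, the final output in this application is a variable-length hash value, which can resolve hash collisions. In a typical scenario, the application fixes a hash length in advance and then starts computing hash values. Initially every hash value starts at the first hash word, i.e. no bytes are dropped from the beginning of any hash value. If a collision later occurs, all hash values involved are adjusted to eliminate it, which can be done by appending one extra hash word to each of them. This has a minor drawback: different hash values would then have different lengths. The solution in this embodiment is to drop one hash word from the beginning of the hash value at the same time, so that the overall length stays unchanged. In rare cases the adjustment may have to be repeated: if the first adjustment does not resolve the collision, or introduces a new one, adjustment continues until all collisions are eliminated. Note also that although adjusting by whole hash words is simple and efficient, it is not required; adjustment can be done at byte granularity if needed. For example, if two different files both hash to h_1h_2h_3h_4, one more hash word is computed for each, giving h_1h_2h_3h_4h_5 and h_1h_2h_3h_4h_5'. To keep the length unchanged, the leading h_1 is dropped from both, giving h_2h_3h_4h_5 and h_2h_3h_4h_5'. If h_5 and h_5' differ, the two recomputed hash values differ and the collision between them is eliminated (they may still collide with other hash values, in which case an extra hash word must be computed for all hash values involved). If h_5 and h_5' are equal, further hash words are appended (dropping leading hash words to keep the length unchanged) until no collision remains.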
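The sliding-window adjustment can be sketched as below. The callback `hash_word(item, i)`, which returns the i-th hash word (1-based) of an item, is hypothetical and stands in for the real hash computation; `resolve_collisions` and the `max_rounds` safety limit are likewise illustrative additions, not part of the source.

```python
def resolve_collisions(items, hash_word, length, max_rounds=64):
    """Shift colliding hash windows forward one word at a time.

    Returns {item: (words_dropped_at_front, window_of_hash_words)}.
    """
    start = {item: 1 for item in items}        # every hash starts at h_1

    def window(item):
        first = start[item]
        return tuple(hash_word(item, first + k) for k in range(length))

    for _ in range(max_rounds):
        groups = {}
        for item in items:
            groups.setdefault(window(item), []).append(item)
        colliding = [item for group in groups.values()
                     if len(group) > 1 for item in group]
        if not colliding:
            return {item: (start[item] - 1, window(item)) for item in items}
        for item in colliding:                 # append one word, drop the first
            start[item] += 1
    raise RuntimeError("collisions not resolved within max_rounds")


# Toy data: files "a" and "b" agree on h_1..h_4 and differ at h_5.
table = {"a": [1, 2, 3, 4, 5], "b": [1, 2, 3, 4, 6], "c": [9, 9, 9, 9, 9]}
resolved = resolve_collisions(list(table), lambda item, i: table[item][i - 1], 4)
```

Only the hash values involved in a collision move; "c" keeps its original window, and the recorded drop count is what would be shown as the prefix of the printed hash.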
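Moreover, the hash algorithm designed in the embodiments of this application is optimized for single instruction, multiple data (SIMD) intrinsics, which are supported on most of today's CPUs and on all GPUs and can greatly increase computation speed.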
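Referring to Fig. 5, an embodiment of this application also provides a construction by which the general-purpose lightweight hash processing method computes the hash value of structured, grouped data.
As Fig. 5 shows, for structured, grouped data each group is padded, all padded groups are then concatenated, and finally the hash value of the whole concatenation is computed. This pad-then-concatenate approach differs from the traditional concatenate-then-pad approach, which cannot preserve the structure of the data because no group boundary information is reflected in the final padded data. Pad-then-concatenate still yields a single hash value for the whole data and, when suffix-free or prefix-free padding is used, preserves the structure of the data. It is therefore better suited to structured, grouped data such as a key-value pair, the list of files in a folder, or the list of transactions in a ledger.
In addition, by using structured, grouped data, the hash algorithm of this embodiment can be used as a keyed hash function: the key and the data are treated as two groups of structured data when computing the hash value.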
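The difference between the two orderings can be shown with a stand-in padding rule. The rule used here (zero-fill to an 8-byte boundary, then append the byte length as one 64-bit word) is an assumption chosen only to make the example runnable; it is not the padding formula of this application, which also mixes in a register word. The function names are illustrative.

```python
def pad_group(group):
    """Stand-in padding: zero-fill to 8 bytes, then append the length word."""
    fill = (-len(group)) % 8
    return group + b"\x00" * fill + len(group).to_bytes(8, "little")


def concat_then_pad(groups):
    """Traditional order: group boundaries vanish before padding is applied."""
    return pad_group(b"".join(groups))


def pad_then_concat(groups):
    """Order used for structured data: each group is padded on its own."""
    return b"".join(pad_group(group) for group in groups)


first = [b"ab", b"c"]     # e.g. key "ab" with value "c"
second = [b"a", b"bc"]    # e.g. key "a" with value "bc"
same_when_concatenated = concat_then_pad(first) == concat_then_pad(second)
same_when_padded_first = pad_then_concat(first) == pad_then_concat(second)
```

The two groupings feed identical bytes to the hash under concatenate-then-pad, so they would necessarily collide, while pad-then-concatenate keeps them apart.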
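In the basic algorithm of the two embodiments above, each set S_i = [a_i, b_i, c_i, d_i] consists of four 64-bit words. In a more general design, the number of 64-bit words in each set S_i can be increased as needed; more words improve security but reduce speed. The number of words may be any value not less than 4, but an even number or a power of 2 is recommended in order to benefit from SIMD intrinsics. The basic algorithm is easily modified to use more words per set. The iteration count of the second loop (the loop over j in the initialization algorithm) is the number of words per set plus the number of bits per word, i.e. 4+64=68; in the general algorithm this count should therefore become q+64, where q is the number of words in each set S_i. Beyond that, the additional words in each set S_i are added to the code and treated in the same way as the original four. When the number of words per set is not 4, the hexadecimal representation of each hash value must begin with that number followed by a colon. For example, if the hash value b-40b9442506...3d627a given in the earlier example was computed with 6 words per S_i, its full form is 6:b-40b9442506...3d627a.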
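The effect of the q+64 spacing can be sketched with a generalized initialization of the sets. As before, the LFSR step (XOR with the toggle mask when the low bit is set) is an assumed reconstruction of a formula image, and `initial_sets` is an illustrative name.

```python
TOGGLE_MASK = 0x95AC9329AC4BC9B5   # toggle mask m from the text


def lfsr(y):
    """One Galois LFSR step (XOR with the mask is an assumed reconstruction)."""
    low_bit = y & 1
    y >>= 1
    return y ^ TOGGLE_MASK if low_bit else y


def initial_sets(n, q=4):
    """Build n + 1 sets of q consecutive LFSR states, set starts q + 64 apart."""
    sets = []
    first = TOGGLE_MASK
    for i in range(n + 1):
        if i > 0:
            for _ in range(q + 64):        # generalized second-loop count
                first = lfsr(first)
        words = [first]
        while len(words) < q:              # remaining q - 1 words of the set
            words.append(lfsr(words[-1]))
        sets.append(words)
    return sets


six_word_sets = initial_sets(2, q=6)
```

Whatever q is chosen, the last word of one set and the first word of the next stay (q + 64) - (q - 1) = 65 states apart, so the non-overlap property of the four-word case carries over.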
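The hash algorithm of this application can therefore adjust the security of the LHA algorithm dynamically, without changing the basic algorithm, simply by changing the number of words in each set S_i = [a_i, b_i, c_i, d_i, …]. Adjusting the security of the hash algorithm dynamically makes it possible to respond effectively, at minimal cost, to new and unknown attacks.
In addition, although the hash algorithm of this application is described only for 64-bit operations, the basic algorithm can be applied almost unchanged to other word sizes. On small, resource-constrained devices that do not support 64-bit operations, for example, 32-bit, 16-bit or 8-bit operations can be used instead.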
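Referring to Fig. 6, based on the same inventive concept, the processor 120 includes a state transition module 121, an initialization module 122, an update module 123 and a finalization module 124.
The state transition module 121 is configured to perform state transitions on a register using a register state transition function, the register being a linear feedback shift register.
The initialization module 122 is configured to initialize the internal state of the hash algorithm using the initialization algorithm, based on the output of the register in the state transition module.
The update module 123 is configured to use the outputs of the state transition module and the initialization module, together with the padded data, to update the initialized internal state of the hash algorithm using the update algorithm.
The finalization module 124 is configured to post-process the updated internal state and generate the final hash value. For a description of these modules, refer to the explanation of the method shown in Figs. 2 to 5, which is not repeated here.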
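In the embodiments provided by this application, it should be understood that the disclosed apparatus and method can also be implemented in other ways. The apparatus and method embodiments described above are illustrative only. For example, the flow charts and block diagrams in the drawings show the possible architecture, functions and operations of apparatus, methods and computer program products according to several embodiments of this application. In this respect, each block in a flow chart or block diagram may represent a module, a program segment or a portion of code containing one or more executable instructions that implement the specified logical function. It should also be noted that in some alternative implementations the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. Note too that each block in the block diagrams and/or flow charts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of this application may be integrated into one independent part, may each exist separately, or two or more of them may be integrated into one independent part.
If the functions are implemented as software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied as a software product stored in a storage medium and including instructions that cause a computer device (which may be a personal computer, the general-purpose lightweight hash processing system 100, a network device or the like) to perform all or some of the steps of the methods of the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc. Note that in this document the terms "include", "comprise" and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of further identical elements in the process, method, article or device that includes the element.
It should be noted that the above embodiments serve only to illustrate, not to limit, the technical solution of the invention. Although the invention has been described in detail with reference to preferred embodiments, a person of ordinary skill in the art should understand that the technical solution may be modified or replaced with equivalents without departing from its spirit and scope, and all such modifications shall be covered by the claims of the invention.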

Claims (10)

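  1. A general-purpose lightweight hash processing method, characterized in that it is applied to a general-purpose lightweight hash processing system, the method comprising:
    selecting a linear feedback shift register and using a register state transition function to perform state transitions on the linear feedback shift register;
    using the output of the linear feedback shift register, together with an initialization algorithm, to initialize the internal state of the hash algorithm;
    inputting the data whose hash value is to be computed and using an update algorithm to update the initialized internal state, the data whose hash value is to be computed being padded data;
    using a finalization algorithm to post-process the updated internal state and generate the final hash value.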
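  2. The general-purpose lightweight hash processing method according to claim 1, characterized in that the initialization algorithm is expressed as:
    Constant: toggle mask m
    Input: uninitialized internal matrix M = [S_0; S_1; …; S_n] and three 64-bit words s, t, x
    Returns: initialized M, s, t, x
    1  a_0 ← m
    2  b_0 ← lfsr(a_0)
    3  c_0 ← lfsr(b_0)
    4  d_0 ← lfsr(c_0)
    5  s ← a_0 + b_0 + c_0 + d_0
    6  t ← s
    7  x ← 0
    8  for i from 1 to n
    9      a_i ← a_{i-1}
    10     for j from 1 to 68
    11         a_i ← lfsr(a_i)
    12     b_i ← lfsr(a_i)
    13     c_i ← lfsr(b_i)
    14     d_i ← lfsr(c_i)
    row i of the internal matrix M, S_i = [a_i, b_i, c_i, d_i], is set directly to the four register state values starting from the (i×68)-th state, with the toggle mask m serving as the first state value of the register; the words s, t and x are also initialized, s and t being set to (a_0 + b_0 + c_0 + d_0) and x being set to 0.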
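  3. The general-purpose lightweight hash processing method according to claim 2, characterized in that the update algorithm is expressed as:
    Input: the data whose hash value is to be computed, data
    Result: updated internal matrix M = [S_0; S_1; …; S_n] and three 64-bit words s, t, x
    1  function update(data):
    2      for each 64-bit word w in data do
    3          Figure PCTCN2022131905-appb-100001
    4          Figure PCTCN2022131905-appb-100002
    5          S_0 ← S_0 ·+ x
    6          t ← t + w
    7          Figure PCTCN2022131905-appb-100003
    8          t ← t + (t <<< 31)
    9          t ← t + (t <<< 15)
    10         t ← t + (t <<< 7)
    11         s ← lfsr(s)
    12         x ← x + s
    13         Figure PCTCN2022131905-appb-100004
    14         Figure PCTCN2022131905-appb-100005
    15         Figure PCTCN2022131905-appb-100006
    16         for i from 1 to n
    17             Figure PCTCN2022131905-appb-100007
    18             S_i ← S_i ·+ x
    where ⊕ is the bitwise XOR operation, <<< is bitwise rotate-left, >> is logical right shift, << is logical left shift, ·⊕ is the "dot-XOR" bitwise operation, and ·+ is the "dot-add" operation; the update algorithm uses each 64-bit word of the input data to update the internal state; the input data is padded to a length containing one or more complete 64-bit words; the inputs of the update function are one 64-bit state word from the register, one 64-bit word of the padded data, and the current internal state; and the output of the update function is the new internal state returned by each call of the update function.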
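  4. The general-purpose lightweight hash processing method according to any one of claims 1 to 3, characterized in that the data padding methods used to produce the padded data include
    suffix-free padding and prefix-free padding.
  5. The general-purpose lightweight hash processing method according to claim 4, characterized in that, in the prefix-free padding,
    padding the data comprises the following steps:
    step 1: simply append enough zeros (no zeros are needed if the original data already consists of one or more complete 64-bit words),
    step 2: append one 64-bit word w, computed from the data length z (the number of bytes in the data) and the word s, namely:
    s ← lfsr(s)
    Figure PCTCN2022131905-appb-100010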
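  6. The general-purpose lightweight hash processing method according to claim 5, characterized in that the finalization algorithm mixes S_0 = [a_0, b_0, c_0, d_0] with each of the other sets S_i = [a_i, b_i, c_i, d_i] and then compresses the mixed result to generate the hash word h_i, where i = 1, 2, …, n, the finalization algorithm being expressed as:
    Input: internal matrix M = [S_0; S_1; …; S_n]
    Result: hash value H = h_1h_2…h_n
    1  for i from 1 to n
    2      a ← a_0
    3      b ← b_0
    4      c ← c_0
    5      d ← d_0
    6      for j from 1 to 9
    7          a ← a + (a <<< 31)
    8          Figure PCTCN2022131905-appb-100011
    9          b ← b + (b <<< 15)
    10         Figure PCTCN2022131905-appb-100012
    11         c ← c + (c <<< 7)
    12         Figure PCTCN2022131905-appb-100013
    13         d ← d + (d <<< 3)
    14         Figure PCTCN2022131905-appb-100014
    15         a_i ← a_i + (a_i >>> 31)
    16         Figure PCTCN2022131905-appb-100015
    17         b_i ← b_i + (b_i >>> 15)
    18         Figure PCTCN2022131905-appb-100016
    19         c_i ← c_i + (c_i >>> 7)
    20         Figure PCTCN2022131905-appb-100017
    21         d_i ← d_i + (d_i >>> 3)
    22         Figure PCTCN2022131905-appb-100018
    23     h_i ← a
    where >>> is the bitwise rotate-right operation.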
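  7. The general-purpose lightweight hash processing method according to any one of claims 1 to 3 and 5 to 6, characterized in that the update algorithm and the finalization algorithm form a two-layer design, with the update algorithm as the upper layer and the finalization algorithm as the lower layer.
  8. A general-purpose lightweight hash processing system, further characterized in that it comprises
    a processor,
    a network module and
    a memory;
    wherein the processor and the memory communicate through the network module, and the processor reads a computer program from the memory and runs it to perform the method according to any one of claims 1 to 7.
  9. The general-purpose lightweight hash processing system according to claim 8, further characterized in that the processor comprises:
    a state transition module configured to perform state transitions on a register using a register state transition function, the register being a linear feedback shift register;
    an initialization module configured to initialize the internal state of the hash algorithm using an initialization algorithm, based on the output of the register in the state transition module;
    an update module configured to use the outputs of the state transition module and the initialization module, together with the padded data, to update the initialized internal state of the hash algorithm using an update algorithm;
    a finalization module configured to post-process the updated internal state and generate the final hash value.
  10. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, at least one program, a code set or an instruction set, which is loaded and executed by a processor to implement the general-purpose lightweight hash processing method according to any one of claims 1 to 8.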
PCT/CN2022/131905 2022-07-04 2022-11-15 General-purpose lightweight hash processing method, system and storable medium WO2024007506A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/348,872 US20240007269A1 (en) 2022-07-04 2023-07-07 General-Purpose Lightweight Hash Processing Method and System and Storable Medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210787320.0A CN115378575A (zh) 2022-07-04 2022-07-04 General-purpose lightweight hash processing method, system and storable medium
CN202210787320.0 2022-07-04

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/348,872 Continuation US20240007269A1 (en) 2022-07-04 2023-07-07 General-Purpose Lightweight Hash Processing Method and System and Storable Medium

Publications (1)

Publication Number Publication Date
WO2024007506A1 true WO2024007506A1 (zh) 2024-01-11

Family

ID=84062277

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/131905 WO2024007506A1 (zh) 2022-07-04 2022-11-15 General-purpose lightweight hash processing method, system and storable medium

Country Status (2)

Country Link
CN (1) CN115378575A (zh)
WO (1) WO2024007506A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116186745B (zh) * 2023-04-27 2023-07-18 暗链科技(深圳)有限公司 Hash encryption method, non-volatile readable storage medium and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
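CN101753291A (zh) * 2008-11-28 2010-06-23 佳能株式会社 Hash value calculation apparatus and method thereof
JP2017058501A (ja) * 2015-09-16 2017-03-23 日本電信電話株式会社 Hash function calculation device and method
CN109088718A (zh) * 2018-07-11 2018-12-25 上海循态信息科技有限公司 Privacy amplification method and system based on linear feedback shift register
CN111464308A (zh) * 2020-03-12 2020-07-28 烽火通信科技股份有限公司 Method and system for reconfigurable implementation of multiple hash algorithms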

Also Published As

Publication number Publication date
CN115378575A (zh) 2022-11-22

Similar Documents

Publication Publication Date Title
KR102137956B1 (ko) Block mining method and apparatus
CN109639428B (zh) Method for constructing secure hash functions from bit mixers
US8787563B2 (en) Data converter, data conversion method and program
WO2024007506A1 (zh) General-purpose lightweight hash processing method, system and storable medium
Lemire et al. Strongly universal string hashing is fast
Bernstein et al. Really fast syndrome-based hashing
Ye et al. A further study of the linear complexity of new binary cyclotomic sequence of length p^r
Tang et al. Binary linear codes from vectorial Boolean functions and their weight distribution
US20240007269A1 (en) General-Purpose Lightweight Hash Processing Method and System and Storable Medium
US8225100B2 (en) Hash functions using recurrency and arithmetic
US7895347B2 (en) Compact encoding of arbitrary length binary objects
US10387350B1 (en) Configurable sponge function engine
Finiasz et al. Improved fast syndrome based cryptographic hash functions
WO2002101984A1 (en) Method and apparatus for creating a message digest using a multiple round one-way hash algorithm
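JP2009169316A (ja) Hash function operation device, signature device, program, and hash function operation method
CN109687972B (zh) Circuit supporting multiple hash algorithms
WO2022247193A1 (zh) Apparatus, method, chip, computer device and medium for data processing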
Sagar Cryptographic Hashing Functions-MD5
Rohit et al. Practical Forgery attacks on Limdolen and HERN
Sarkar Domain extender for collision resistant hash functions: Improving upon Merkle–Damgård iteration
Alahmad et al. Multicollisions in sponge construction
US20230297693A1 (en) Information processing apparatus, information processing method, and non-transitory computer readable medium storing program
US11368166B2 (en) Efficient encoding methods using bit inversion and padding bits
Atighehchi A precise non-asymptotic complexity analysis of parallel hash functions without tree topology constraints
US20220376892A1 (en) Speeding up hash-chain computations

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22950064

Country of ref document: EP

Kind code of ref document: A1