CN109861811A - A fast implementation method for the overlapping subsequence test

Info

Publication number: CN109861811A
Application number: CN201910179247.7A
Authority: CN
Inventors: 杨先伟; 朱翔; 屈寅春
Original assignee: Wuxi Institute of Technology
Current assignee: Wuxi Institute of Technology
Priority date: 2019-03-10
Filing date: 2019-03-10
Publication date: 2019-06-07

Abstract

The present invention relates to the field of information security, and more particularly to a fast implementation method for the overlapping subsequence test. The method comprises the steps of: counting the 5-bit subsequence frequencies; computing the 1- to 4-bit subsequence frequencies; computing the intermediate variables; computing the statistic values; and comparing the statistic values with the thresholds to determine whether the sequence under test passes the test. By merging the computations for parameters m=2 and m=5 and streamlining the test flow, the invention solves the problem that a computer runs the overlapping subsequence test on a sequence under test with low efficiency.

Description

A fast implementation method for the overlapping subsequence test

Technical field

The present invention relates to the field of information security, and more particularly to a fast implementation method for the overlapping subsequence test.

Background art

Random number generators are the security foundation of cryptography and provide the essential security guarantee for many cryptographic techniques. The output of a random number generator is called a random number and is widely used in many areas, for example key generation for symmetric and asymmetric cryptographic algorithms, challenge values in challenge-response mechanisms, secret information in digital signature schemes, and resistance to side-channel analysis attacks. The quality of random numbers can be checked with general-purpose statistical test methods to determine whether the output has obvious defects. Typically, several test items that target different statistical properties are combined into a statistical test suite; users apply the suite to the sequence under test and judge the quality of the random numbers from the results. Common statistical test suites include the special publication "A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications" issued by the U.S. National Institute of Standards and Technology, and the Chinese national standard GB/T 32915-2016 "Information security technology: Binary sequence randomness detection method", promulgated in 2016. Both standards recommend 15 statistical test methods, and the overlapping subsequence test is one of them. This test checks whether the numbers of occurrences of the individual patterns of overlapping m-bit subsequences in the sequence under test are approximately equal.

For ease of description, denote the n-bit binary sequence under test by ε, and denote by v[i₁i₂…i_m] the frequency with which the m-bit subsequence pattern i₁i₂…i_m occurs. The Chinese test standard specifies n = 1,000,000.

The traditional overlapping subsequence test proceeds as follows:

Step 1: from the n-bit binary sequence ε under test, construct a new sequence ε′ by appending the first m-1 bits of ε to its end; the new sequence has length n+m-1, and m takes the values 2 and 5.

Step 2: in the new sequence, count the frequency v[i₁i₂…i_m] of each m-bit subsequence pattern i₁i₂…i_m (2^m patterns in total), the frequency v[i₁i₂…i_{m-1}] of each (m-1)-bit subsequence pattern i₁i₂…i_{m-1} (2^{m-1} patterns in total), and the frequency v[i₁i₂…i_{m-2}] of each (m-2)-bit subsequence pattern i₁i₂…i_{m-2} (2^{m-2} patterns in total).

Step 3: compute the intermediate variables.

Step 4: compute the statistic values.

Step 5: compute the P-values P-value1 and P-value2.

Step 6: if P-value1 ≥ α and P-value2 ≥ α, the sequence under test passes the test.
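The formulas for steps 3 to 5 are not reproduced in the text above. As a hedged reconstruction, the serial test of NIST SP 800-22, which the background identifies as the same test, defines them as below. The symbols ψ², ∇ψ² and ∇²ψ² are taken from that publication and are an assumption with respect to this document.

```latex
\begin{aligned}
\psi^2_m &= \frac{2^m}{n}\sum_{i_1 i_2 \ldots i_m} v[i_1 i_2 \ldots i_m]^2 - n,
  \qquad \psi^2_0 = 0, \\
\nabla\psi^2_m &= \psi^2_m - \psi^2_{m-1}, \\
\nabla^2\psi^2_m &= \psi^2_m - 2\psi^2_{m-1} + \psi^2_{m-2}, \\
\text{P-value1} &= \mathrm{igamc}\!\left(2^{m-2},\ \tfrac{1}{2}\nabla\psi^2_m\right), \\
\text{P-value2} &= \mathrm{igamc}\!\left(2^{m-3},\ \tfrac{1}{2}\nabla^2\psi^2_m\right).
\end{aligned}
```

Under this reading, the first igamc arguments are 8 and 4 for m=5 and 1 and 0.5 for m=2, which agrees with the igamc arguments used in the threshold definitions later in this document.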

Extensive experiments show that existing implementations are inefficient, for the following reasons. First, the data under test are actually stored as bytes, yet the test counts bit by bit and handles only one bit at a time, so the processor word length is not fully used. Second, the subsequence frequencies for m=2 and m=5 are counted separately, which requires loading the same data five times to obtain the frequency statistics of the 1- to 5-bit subsequences. Third, the P-value decision requires evaluating the igamc function on every run. A faster implementation is therefore needed in practice to substantially improve the efficiency of the whole test suite, so a fast implementation of the overlapping subsequence test has considerable practical significance.

Summary of the invention

The purpose of the present invention is to address the above shortcomings by providing a fast implementation method for the overlapping subsequence test. By merging the computations for parameters m=2 and m=5 and streamlining the test flow, it solves the problem that a computer runs the overlapping subsequence test on a sequence under test with low efficiency.

The present invention is achieved by the following technical solution:

A fast implementation method for the overlapping subsequence test, characterized in that the method comprises the following steps:

S1, count the 5-bit subsequence frequencies: for the sequence under test, count the frequency v⁽⁵⁾[i] of each 5-bit subsequence, 0 ≤ i ≤ 31, i.e. i = 0, 1, 2, 3, ..., 31.

S2, compute the 1- to 4-bit subsequence frequencies: from the 5-bit subsequence frequencies obtained in S1, derive the 1-bit, 2-bit, 3-bit and 4-bit subsequence frequencies, namely v⁽¹⁾[i₁], 0 ≤ i₁ ≤ 1, i.e. i₁ = 0, 1; v⁽²⁾[i₂], 0 ≤ i₂ ≤ 3, i.e. i₂ = 0, 1, 2, 3; v⁽³⁾[i₃], 0 ≤ i₃ ≤ 7, i.e. i₃ = 0, 1, 2, ..., 7; v⁽⁴⁾[i₄], 0 ≤ i₄ ≤ 15, i.e. i₄ = 0, 1, 2, ..., 15.

S3, compute the intermediate variables of the overlapping subsequence test, where n is the bit length of the sequence under test and the frequencies v⁽ⁿ⁾[i₁,i₂,...,i_n] are computed as in S1 and S2.

S4, compute the statistic values of the overlapping subsequence test.

S5, comparison and decision: compare the statistic values with the thresholds to determine whether the sequence under test passes the test.
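The expressions for S3 and S4 are missing from the text. A plausible reconstruction, assuming the definitions of the NIST SP 800-22 serial test applied to the frequency tables v⁽ᵐ⁾ from S1 and S2, is given below. The symbols ψ² and ∇ are assumptions and not confirmed notation of this document.

```latex
\begin{aligned}
\psi^2_m &= \frac{2^m}{n}\sum_{i=0}^{2^m-1}\bigl(v^{(m)}[i]\bigr)^2 - n,
  \quad m = 1,\ldots,5, \qquad \psi^2_0 = 0, \\
\nabla\psi^2_5 &= \psi^2_5 - \psi^2_4, \\
\nabla^2\psi^2_5 &= \psi^2_5 - 2\psi^2_4 + \psi^2_3, \\
\nabla\psi^2_2 &= \psi^2_2 - \psi^2_1, \\
\nabla^2\psi^2_2 &= \psi^2_2 - 2\psi^2_1 + \psi^2_0 .
\end{aligned}
```

Under this reading, S3 produces the five values ψ²₁ to ψ²₅ and S4 produces the four statistic values that S5 compares with thresholds.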

Further, when step S1 counts the 5-bit subsequence frequencies, the sequence under test does not need to be split from bytes into bits; instead, w bytes may be loaded at a time and treated as one word, where w is 1, 2, 4 or 8.

Further, when step S1 counts the 5-bit subsequence frequencies, first denote bits x through y of word A by A_[x..y], and denote the concatenation of words A and B by A||B = 2^{8w}A + B; then perform the following steps for j = 0 to n-1:

S11, obtain the position variables: split j into the word index a and the bit offset b within the word, where j = 8wa + b and 0 ≤ b < 8w.

S12, obtain the 5-bit value: if b+5 < 8w, fetch the a-th word A and take x = A_[b:b+5]; otherwise fetch the a-th word A and the (a+1)-th word B and take x = (B||A)_[b:b+5].

S13, update the frequency: v⁽⁵⁾[x] = v⁽⁵⁾[x] + 1.
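Steps S11 to S13 can be sketched as follows. This is a minimal illustration, not the patented implementation. It assumes little-endian words (so that sequence bit j is bit b of word a), an input length that is a multiple of w, and a wrap-around to the first word for the last few windows, which the text does not spell out. The function name `count_5bit` is hypothetical.

```python
def count_5bit(data: bytes, w: int = 4) -> list[int]:
    """Count cyclic 5-bit pattern frequencies, loading w bytes per word."""
    if w not in (1, 2, 4, 8) or not data or len(data) % w != 0:
        raise ValueError("w must be 1, 2, 4 or 8 and divide a non-empty input")
    bits_per_word = 8 * w
    # Assumption: little-endian words, so bit j of the sequence is
    # bit (j mod 8w) of word (j // 8w).
    words = [int.from_bytes(data[k:k + w], "little")
             for k in range(0, len(data), w)]
    # Assumption: the last windows wrap around to the start of the sequence.
    words.append(words[0])
    n = 8 * len(data)
    v5 = [0] * 32
    for j in range(n):
        a, b = divmod(j, bits_per_word)      # S11: j = 8w*a + b
        if b + 5 < bits_per_word:            # S12: window inside word a
            x = (words[a] >> b) & 0x1F
        else:                                # S12: take bits from B || A
            joined = (words[a + 1] << bits_per_word) | words[a]
            x = (joined >> b) & 0x1F
        v5[x] += 1                           # S13: update the frequency
    return v5
```

The result does not depend on w; a larger w only reduces how often two words have to be joined.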

Further, step S2 may compute the 1- to 4-bit subsequence frequencies in a chained manner: the 4-bit subsequence frequencies are computed from the 5-bit subsequence frequencies, the 3-bit from the 4-bit, the 2-bit from the 3-bit, and the 1-bit from the 2-bit.

Further, step S2 may alternatively compute the 1- to 4-bit subsequence frequencies directly from the 5-bit subsequence frequencies.

Further, in step S2 the computations are implemented as follows:

4-bit from 5-bit: v⁽⁴⁾[k] = v⁽⁵⁾[k] + v⁽⁵⁾[16+k], 0 ≤ k ≤ 15;
3-bit from 4-bit: v⁽³⁾[k] = v⁽⁴⁾[k] + v⁽⁴⁾[8+k], 0 ≤ k ≤ 7;
2-bit from 3-bit: v⁽²⁾[k] = v⁽³⁾[k] + v⁽³⁾[4+k], 0 ≤ k ≤ 3;
1-bit from 2-bit: v⁽¹⁾[k] = v⁽²⁾[k] + v⁽²⁾[2+k], 0 ≤ k ≤ 1;
3-bit from 5-bit: v⁽³⁾[k] = ∑_{i=0,1,...,3} v⁽⁵⁾[8i+k], 0 ≤ k ≤ 7;
2-bit from 5-bit: v⁽²⁾[k] = ∑_{i=0,1,...,7} v⁽⁵⁾[4i+k], 0 ≤ k ≤ 3;
1-bit from 5-bit: v⁽¹⁾[k] = ∑_{i=0,1,...,15} v⁽⁵⁾[2i+k], 0 ≤ k ≤ 1.
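The two variants of S2 translate almost literally into code. The sketch below implements both the chained and the direct formulas so that they can be compared; the function names are hypothetical.

```python
def derive_chained(v5):
    """Chained variant: 5-bit -> 4-bit -> 3-bit -> 2-bit -> 1-bit."""
    v4 = [v5[k] + v5[16 + k] for k in range(16)]
    v3 = [v4[k] + v4[8 + k] for k in range(8)]
    v2 = [v3[k] + v3[4 + k] for k in range(4)]
    v1 = [v2[k] + v2[2 + k] for k in range(2)]
    return v1, v2, v3, v4


def derive_direct(v5):
    """Direct variant: every table is summed straight from the 5-bit table."""
    v4 = [v5[k] + v5[16 + k] for k in range(16)]
    v3 = [sum(v5[8 * i + k] for i in range(4)) for k in range(8)]
    v2 = [sum(v5[4 * i + k] for i in range(8)) for k in range(4)]
    v1 = [sum(v5[2 * i + k] for i in range(16)) for k in range(2)]
    return v1, v2, v3, v4
```

Both variants return identical tables. The chained form performs fewer additions (16 + 8 + 4 + 2 = 30), while the direct form has no dependency between the tables.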

Further, the comparison and decision in step S5 may be carried out in mode (S5-A): compute the P-values P-value1⁽⁵⁾, P-value2⁽⁵⁾, P-value1⁽²⁾ and P-value2⁽²⁾.

If P-value1⁽⁵⁾ ≥ α, P-value2⁽⁵⁾ ≥ α, P-value1⁽²⁾ ≥ α and P-value2⁽²⁾ ≥ α all hold, the sequence under test passes the test.
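As an illustration of mode (S5-A), the sketch below evaluates the four P-values with closed forms of igamc that hold for the specific first arguments involved (8, 4, 1 and 0.5, the values that appear in the threshold definitions of this document). The pairing of each statistic with its argument follows the NIST SP 800-22 serial test and is an assumption here, as are α = 0.01, the function names, and the choice to return 1 for a non-positive second argument.

```python
import math


def igamc(a, x):
    """Regularized upper incomplete gamma Q(a, x) for a = 0.5 or integer a >= 1."""
    if x <= 0:
        return 1.0  # assumption: non-positive statistics are treated as P = 1
    if a == 0.5:
        return math.erfc(math.sqrt(x))       # Q(1/2, x) = erfc(sqrt(x))
    if a >= 1 and a == int(a):
        term, total = 1.0, 1.0
        for k in range(1, int(a)):           # Q(k, x) = e^-x * sum_{j<k} x^j / j!
            term *= x / k
            total += term
        return math.exp(-x) * total
    raise ValueError("only a = 0.5 or a positive integer is supported")


def p_values(d1_5, d2_5, d1_2, d2_2):
    """P-values for the four statistics (first/second difference, m = 5 and m = 2)."""
    return (igamc(8, d1_5 / 2), igamc(4, d2_5 / 2),
            igamc(1, d1_2 / 2), igamc(0.5, d2_2 / 2))


def passes_s5a(stats, alpha=0.01):
    """Mode (S5-A): pass only if every P-value is at least alpha."""
    return all(p >= alpha for p in p_values(*stats))
```

This mode evaluates igamc four times per tested sequence, which is the cost the background section points out.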

Further, the comparison and decision in step S5 may alternatively be carried out in mode (S5-B): if each statistic value does not exceed its corresponding threshold, the sequence under test passes the test, where

λ^(1,5) = max{V | igamc(8, V/2) ≥ α},

λ^(2,5) = max{V | igamc(4, V/2) ≥ α},

λ^(1,2) = max{V | igamc(1, V/2) ≥ α},

λ^(2,2) = max{V | igamc(0.5, V/2) ≥ α}.

Further, for mode (S5-B) of the comparison and decision in step S5, the thresholds rounded to six decimal places are λ^(1,5) = 31.999927, λ^(2,5) = 20.090235, λ^(1,2) = 9.210340 and λ^(2,2) = 6.634897.
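The thresholds of mode (S5-B) can be cross-checked numerically. The sketch below solves igamc(a, V/2) = α by bisection, assuming α = 0.01. That value is an inference: the text does not state α, but the quoted constants are consistent with it. The helper names are hypothetical, and igamc is implemented only for the four first arguments that occur here.

```python
import math


def igamc(a, x):
    """Q(a, x) via closed forms valid for a = 0.5 and positive integer a."""
    if a == 0.5:
        return math.erfc(math.sqrt(x))
    term, total = 1.0, 1.0
    for k in range(1, int(a)):
        term *= x / k
        total += term
    return math.exp(-x) * total


def threshold(a, alpha=0.01):
    """Largest V with igamc(a, V/2) >= alpha, found by bisection.

    igamc(a, x) decreases in x, so the condition holds on [0, lambda].
    """
    lo, hi = 0.0, 200.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if igamc(a, mid / 2) >= alpha:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    for name, a in (("(1,5)", 8), ("(2,5)", 4), ("(1,2)", 1), ("(2,2)", 0.5)):
        print(f"lambda^{name} = {threshold(a):.6f}")
```

Because the thresholds depend only on α, they can be computed once and stored as constants, so the test itself needs four comparisons and no igamc evaluation.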

The present invention processes the data word by word rather than bit by bit. It merges the frequency-counting flows for 1- to 5-bit subsequences so that the same counting process is not repeated: only the 5-bit subsequence frequencies are counted, and the 1- to 4-bit subsequence frequencies are obtained from them directly or indirectly. The invention has the advantages of a small amount of computation, high test efficiency and low resource usage.
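Putting the pieces together, a compact sketch of the merged flow might look as follows. It counts 5-bit patterns in a single pass over a list of bits (for brevity it does not use the word-based loading of S1), derives the shorter frequency tables as in S2, and compares the statistics with the six-decimal thresholds quoted above. The ψ² and difference formulas are assumed from the NIST SP 800-22 serial test, and all names are hypothetical.

```python
# Six-decimal thresholds quoted in the text, keyed by (difference order, m).
LAMBDA = {(1, 5): 31.999927, (2, 5): 20.090235,
          (1, 2): 9.210340, (2, 2): 6.634897}


def psi2(v, n):
    """Assumed NIST-style psi^2 for a frequency table v with 2**m entries."""
    return len(v) / n * sum(c * c for c in v) - n


def merged_test(bits):
    """Return (statistics, passed) for a list of 0/1 values of length >= 5."""
    n = len(bits)
    if n < 5:
        raise ValueError("need at least 5 bits")
    # S1: one counting pass over cyclic 5-bit windows; the first bit of a
    # window becomes the lowest bit of the pattern index x.
    v5 = [0] * 32
    for j in range(n):
        x = 0
        for k in range(5):
            x |= bits[(j + k) % n] << k
        v5[x] += 1
    # S2: chained derivation of the shorter tables.
    v4 = [v5[k] + v5[16 + k] for k in range(16)]
    v3 = [v4[k] + v4[8 + k] for k in range(8)]
    v2 = [v3[k] + v3[4 + k] for k in range(4)]
    v1 = [v2[k] + v2[2 + k] for k in range(2)]
    # S3: intermediate variables (psi^2_0 = 0 by convention).
    p = {0: 0.0, 1: psi2(v1, n), 2: psi2(v2, n),
         3: psi2(v3, n), 4: psi2(v4, n), 5: psi2(v5, n)}
    # S4: first and second differences for m = 5 and m = 2.
    stats = {(1, 5): p[5] - p[4], (2, 5): p[5] - 2 * p[4] + p[3],
             (1, 2): p[2] - p[1], (2, 2): p[2] - 2 * p[1] + p[0]}
    # S5, mode (S5-B): compare each statistic with its threshold.
    passed = all(stats[key] <= LAMBDA[key] for key in LAMBDA)
    return stats, passed
```

For example, `merged_test([0] * 100)` fails the test, since a constant sequence concentrates all counts in a single pattern.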

Brief description of the drawings

Fig. 1 is a flow chart of the fast implementation method for the overlapping subsequence test;

Fig. 2 is a flow chart of step S1 of the method, in which the 5-bit subsequence frequencies are counted.

Detailed description of the embodiments

The present invention is described in detail below with reference to the accompanying drawings and a specific embodiment.

Embodiment 1: in this example a byte sequence is input and the overlapping subsequence test is executed.

Referring to Figs. 1 and 2, the input sequence under test is a byte sequence, and the fast implementation method for the overlapping subsequence test comprises the following steps:

S1: count the 5-bit subsequence frequencies: for the sequence under test, count the frequency v⁽⁵⁾[i] of each 5-bit subsequence, 0 ≤ i ≤ 31. Denote bits x through y of word A by A_[x..y], and denote the concatenation of words A and B by A||B = 2^{8w}A + B. Perform the following steps for j = 0 to n-1:

S11: obtain the position variables: split j into the word index a and the bit offset b within the word, where j = 8wa + b and 0 ≤ b < 8w.

S12: obtain the 5-bit value: if b+5 < 8w, fetch the a-th word A and take x = A_[b:b+5]; otherwise fetch the a-th word A and the (a+1)-th word B and take x = (B||A)_[b:b+5].

S13: update the frequency: v⁽⁵⁾[x] = v⁽⁵⁾[x] + 1.

S2: compute the 1- to 4-bit subsequence frequencies: from the 5-bit subsequence frequencies obtained in S1, derive the 1-bit, 2-bit, 3-bit and 4-bit subsequence frequencies v⁽¹⁾[i], 0 ≤ i ≤ 1; v⁽²⁾[i], 0 ≤ i ≤ 3; v⁽³⁾[i], 0 ≤ i ≤ 7; v⁽⁴⁾[i], 0 ≤ i ≤ 15. In this embodiment the 1- to 4-bit subsequence frequencies are computed directly from the 5-bit subsequence frequencies. The 4-bit subsequence frequencies are computed from the 5-bit subsequence frequencies as v⁽⁴⁾[k] = v⁽⁵⁾[k] + v⁽⁵⁾[16+k], 0 ≤ k ≤ 15;

the 3-bit subsequence frequencies are computed from the 5-bit subsequence frequencies as v⁽³⁾[k] = ∑_{i=0,1,...,3} v⁽⁵⁾[8i+k], 0 ≤ k ≤ 7;

the 2-bit subsequence frequencies are computed from the 5-bit subsequence frequencies as v⁽²⁾[k] = ∑_{i=0,1,...,7} v⁽⁵⁾[4i+k], 0 ≤ k ≤ 3;

the 1-bit subsequence frequencies are computed from the 5-bit subsequence frequencies as v⁽¹⁾[k] = ∑_{i=0,1,...,15} v⁽⁵⁾[2i+k], 0 ≤ k ≤ 1.

S3: compute the intermediate variables.

S4: compute the statistic values.

S5: comparison and decision: if each statistic value does not exceed its corresponding threshold, the sequence under test passes the test, where λ^(1,5) = max{V | igamc(8, V/2) ≥ α}, λ^(2,5) = max{V | igamc(4, V/2) ≥ α}, λ^(1,2) = max{V | igamc(1, V/2) ≥ α} and λ^(2,2) = max{V | igamc(0.5, V/2) ≥ α}. Rounded to six decimal places, λ^(1,5) = 31.999927, λ^(2,5) = 20.090235, λ^(1,2) = 9.210340 and λ^(2,2) = 6.634897.

The above is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A fast implementation method for the overlapping subsequence test, characterized in that the method comprises the following steps:

S1, count the 5-bit subsequence frequencies: for the sequence under test, count the frequency v⁽⁵⁾[i] of each 5-bit subsequence, 0 ≤ i ≤ 31, i.e. i = 0, 1, 2, 3, ..., 31;

S2, compute the 1- to 4-bit subsequence frequencies: from the 5-bit subsequence frequencies obtained in S1, derive the 1-bit, 2-bit, 3-bit and 4-bit subsequence frequencies, namely v⁽¹⁾[i₁], 0 ≤ i₁ ≤ 1, i.e. i₁ = 0, 1; v⁽²⁾[i₂], 0 ≤ i₂ ≤ 3, i.e. i₂ = 0, 1, 2, 3; v⁽³⁾[i₃], 0 ≤ i₃ ≤ 7, i.e. i₃ = 0, 1, 2, ..., 7; v⁽⁴⁾[i₄], 0 ≤ i₄ ≤ 15, i.e. i₄ = 0, 1, 2, ..., 15;

S5, comparison and decision: compare the statistic values with the thresholds to determine whether the sequence under test passes the test.

2. The fast implementation method for the overlapping subsequence test according to claim 1, characterized in that, when step S1 counts the 5-bit subsequence frequencies, the sequence under test does not need to be split from bytes into bits; instead, w bytes may be loaded at a time and treated as one word, where w is 1, 2, 4 or 8.

3. The fast implementation method for the overlapping subsequence test according to claim 1, characterized in that, when step S1 counts the 5-bit subsequence frequencies, bits x through y of word A are first denoted by A_[x..y] and the concatenation of words A and B is denoted by A||B = 2^{8w}A + B; the following steps are then performed for j = 0 to n-1:

S11, obtain the position variables: split j into the word index a and the bit offset b within the word, where j = 8wa + b and 0 ≤ b < 8w;

S12, obtain the 5-bit value: if b+5 < 8w, fetch the a-th word A and take x = A_[b:b+5]; otherwise fetch the a-th word A and the (a+1)-th word B and take x = (B||A)_[b:b+5];

S13, update the frequency: v⁽⁵⁾[x] = v⁽⁵⁾[x] + 1.

4. The fast implementation method for the overlapping subsequence test according to claim 1, characterized in that step S2 computes the 1- to 4-bit subsequence frequencies as follows: the 4-bit subsequence frequencies are computed from the 5-bit subsequence frequencies, the 3-bit from the 4-bit, the 2-bit from the 3-bit, and the 1-bit from the 2-bit.

5. The fast implementation method for the overlapping subsequence test according to claim 1, characterized in that step S2 computes the 1- to 4-bit subsequence frequencies directly from the 5-bit subsequence frequencies.

6. The fast implementation method for the overlapping subsequence test according to claim 4 or 5, characterized in that, in step S2, the 4-bit subsequence frequencies are computed from the 5-bit subsequence frequencies as v⁽⁴⁾[k] = v⁽⁵⁾[k] + v⁽⁵⁾[16+k], 0 ≤ k ≤ 15;

the 3-bit subsequence frequencies are computed from the 4-bit subsequence frequencies as v⁽³⁾[k] = v⁽⁴⁾[k] + v⁽⁴⁾[8+k], 0 ≤ k ≤ 7;

the 2-bit subsequence frequencies are computed from the 3-bit subsequence frequencies as v⁽²⁾[k] = v⁽³⁾[k] + v⁽³⁾[4+k], 0 ≤ k ≤ 3;

the 1-bit subsequence frequencies are computed from the 2-bit subsequence frequencies as v⁽¹⁾[k] = v⁽²⁾[k] + v⁽²⁾[2+k], 0 ≤ k ≤ 1;

the 3-bit subsequence frequencies are computed from the 5-bit subsequence frequencies as v⁽³⁾[k] = ∑_{i=0,1,...,3} v⁽⁵⁾[8i+k], 0 ≤ k ≤ 7;

the 2-bit subsequence frequencies are computed from the 5-bit subsequence frequencies as v⁽²⁾[k] = ∑_{i=0,1,...,7} v⁽⁵⁾[4i+k], 0 ≤ k ≤ 3;

the 1-bit subsequence frequencies are computed from the 5-bit subsequence frequencies as v⁽¹⁾[k] = ∑_{i=0,1,...,15} v⁽⁵⁾[2i+k], 0 ≤ k ≤ 1.

7. The fast implementation method for the overlapping subsequence test according to claim 1, characterized in that the comparison and decision in step S5 is carried out by computing the P-values P-value1⁽⁵⁾, P-value2⁽⁵⁾, P-value1⁽²⁾ and P-value2⁽²⁾.

If P-value1⁽⁵⁾ ≥ α, P-value2⁽⁵⁾ ≥ α, P-value1⁽²⁾ ≥ α and P-value2⁽²⁾ ≥ α all hold, the sequence under test passes the test.

8. The fast implementation method for the overlapping subsequence test according to claim 1, characterized in that the comparison and decision in step S5 is carried out as follows: if each statistic value does not exceed its corresponding threshold, the sequence under test passes the test, where

λ^(1,5) = max{V | igamc(8, V/2) ≥ α},

λ^(2,5) = max{V | igamc(4, V/2) ≥ α},

λ^(1,2) = max{V | igamc(1, V/2) ≥ α},

λ^(2,2) = max{V | igamc(0.5, V/2) ≥ α}.

9. The fast implementation method for the overlapping subsequence test according to claim 8, characterized in that, rounded to six decimal places, λ^(1,5) = 31.999927, λ^(2,5) = 20.090235, λ^(1,2) = 9.210340 and λ^(2,2) = 6.634897.