CN117155531A - Deep learning side channel attack method and system based on CLRM model - Google Patents

Deep learning side channel attack method and system based on CLRM model

Info

Publication number
CN117155531A
CN117155531A (application CN202311093318.4A)
Authority
CN
China
Prior art keywords
layer
side channel
model
clrm
channel attack
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311093318.4A
Other languages
Chinese (zh)
Inventor
黄海
唐新琳
刘志伟
马超
关志博
吴英东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongshu Shenzhen Times Technology Co ltd
Harbin University of Science and Technology
Original Assignee
Zhongshu Shenzhen Times Technology Co ltd
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongshu Shenzhen Times Technology Co ltd, Harbin University of Science and Technology filed Critical Zhongshu Shenzhen Times Technology Co ltd
Priority to CN202311093318.4A priority Critical patent/CN117155531A/en
Publication of CN117155531A publication Critical patent/CN117155531A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/002 Countermeasures against attacks on cryptographic mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/391 Modelling the propagation channel


Abstract

The invention discloses a deep learning side channel attack method and system based on a CLRM model, relates to the technical field of side channel attacks, and aims to solve the problems that existing side channel attack methods need a large number of energy traces, have low model efficiency, and suffer from rapid overfitting and gradient vanishing during training. The method comprises the following steps: S1, acquiring side channel data generated by a cryptographic algorithm while a device is running; S2, constructing a side channel attack model, wherein the model comprises a convolutional neural network module, a long short-term memory network module and a residual network module, and training the side channel attack model with side channel data of a known key so as to establish a mapping relation between the side channel data and the correct key; and S3, performing feature extraction and analysis on the acquired side channel data with the trained side channel attack model so as to recover the correct key. The method can recover the correct key using a small number of energy traces and has a significant advantage in attack efficiency.

Description

Deep learning side channel attack method and system based on CLRM model
Technical Field
The invention relates to the technical field of side channel attack, in particular to a deep learning side channel attack method and system based on a CLRM model.
Background
With the development of deep learning technology and the rapid development of computer technologies such as the mobile internet, cloud computing and big data, devices such as wearables, intelligent terminals and embedded devices have become an indispensable part of life. At the same time, however, cryptographic algorithms are under threat. The attack means that poses the greatest threat is the side channel attack (Side Channel Attack, SCA), whose essence is to capture, with instruments, physical information related to the secret key, such as the sound, electromagnetic radiation and power consumption released during encryption, and then analyze the relationship between this physical leakage information and the intermediate states of the data during encryption, so that the secret key can be cracked at low cost.
In addition, in the side channel attack field, attacks can be classified into two categories according to whether a leakage template is constructed from the side channel information: modeling (profiled) side channel attacks and non-modeling (non-profiled) side channel attacks. A modeling side channel attack comprises a modeling stage and an attack stage, which correspond to the training stage and the prediction stage in deep learning. In the modeling stage, an attacker uses a modeling device identical to the target device to model the dependency between the key-related information of the target device and its side channel leakage; in the attack stage, the established model is used to perform a key recovery attack on the target device. The Template Attack (TA) and the Stochastic Attack (SA), for example, belong to modeling side channel attacks.
At present, deep learning algorithms have been extensively validated experimentally in the side channel field and have achieved good research results. Maghrebi et al. first proposed applying deep learning to side channel attacks, proving that deep-learning-based side channel attacks can effectively attack AES implementations that do not employ mask protection. The most commonly used deep neural network is the convolutional neural network, which has spatial invariance and can naturally resist trace misalignment countermeasures. Moreover, the internal structure of convolutional neural networks enables them to automatically extract features from high-dimensional data, alleviating the need for data preprocessing to some extent. However, a successful deep-learning-based side channel attack still needs a large number of energy traces, the model efficiency needs to be further improved, and the models suffer from rapid overfitting, gradient vanishing and similar problems during training.
Disclosure of Invention
The invention aims to solve the technical problems that:
the existing side channel attack method based on deep learning needs a large amount of energy traces, has low model efficiency, and has the problems of rapid overfitting, gradient disappearance and the like when the model is trained.
The invention adopts the technical scheme for solving the technical problems:
the invention provides a deep learning side channel attack method based on a CLRM Model, wherein the CLRM Model is abbreviated as CNN Model, LSTM Model and ResNet Model, and the method comprises the following steps:
s1, acquiring side channel data generated by a cryptographic algorithm in the running process of equipment;
s2, constructing a side channel attack model, wherein the model comprises a convolutional neural network module, a long-term and short-term memory network module and a residual error network module, and the side channel attack model is trained by using side channel data of a known key so as to establish a mapping relation between the side channel data and a correct key;
and S3, performing feature extraction analysis on the acquired side channel data by adopting the trained side channel attack model so as to realize correct recovery of the secret key.
Further, the convolutional neural network module is configured to extract features of the leakage information and comprises 3 convolution blocks, wherein each convolution block comprises: a convolution layer, a batch normalization layer, a SeLU activation function layer and a MaxPooling layer.
Further, the long short-term memory network module is used for learning different feature information and comprises 2 CuDNNLSTM layers, a batch normalization layer and an activation function layer.
Further, the residual network module includes 2 batch normalization processing layers, 2 SeLU activation function layers, 2 convolution layers, and a soft threshold module.
Further, the specific architecture of the CLRM model includes:
Input layer: the first layer is an Input layer, i.e., the input energy trace vector;
The convolutional neural network module includes 3 convolution blocks:
Convolution block 1: the second layer is a convolution layer Conv, a batch normalization layer BN, an activation function layer SeLU and a maximum pooling layer MaxPooling; the filter size of the convolution layer is 3, the stride is 2, the number of filters is 128, and the padding mode is 'same'; the pooling window of the maximum pooling layer is 2 and the stride is 2;
Convolution block 2: the third layer is a convolution layer Conv, a batch normalization layer BN, an activation function layer SeLU and a maximum pooling layer MaxPooling; the filter size of the convolution layer is 3, the stride is 2, the number of filters is 256, the padding mode is 'same', the pooling window of the maximum pooling layer is 2 and the stride is 2;
Convolution block 3: the fourth layer is a convolution layer Conv, a batch normalization layer BN, an activation function layer SeLU and a maximum pooling layer MaxPooling; the filter size of the convolution layer is 3, the stride is 2, the number of filters is 512, the padding mode is 'same', the pooling window of the maximum pooling layer is 2 and the stride is 2;
Long short-term memory network module: the fifth layer comprises two CuDNNLSTM layers propagating forward and backward, with 128 forward-propagating units and 128 backward-propagating units, and the forward- and backward-propagating CuDNNLSTMs are respectively connected after being processed by a batch normalization layer and an activation function layer;
Maximum pooling layer: the sixth layer is a MaxPooling layer; the pooling window is 2 and the stride is 2;
Residual network module: the seventh layer comprises two batch normalization layers, two SeLU activation function layers, two convolution layers with 256 filters each and a soft threshold module, wherein the convolution kernel size of the convolution layers is 3 and the stride is 2;
Batch normalization layer: the eighth layer is a BN layer;
Activation function layer: the ninth layer is a SeLU activation function layer;
Global average pooling layer: the tenth layer is a GAP layer;
Data flattening layer: the eleventh layer is a Flatten layer;
Fully connected layer 1: the twelfth layer is a Dense layer, and the activation function is ReLU;
Fully connected layer 2: the thirteenth layer is a Dense layer, and the activation function is ReLU;
Dropout layer: the fourteenth layer is a Dropout layer;
Output layer: the fifteenth layer is a Prediction layer, the activation function is softmax, and the probability distribution over the classes is output.
Further, the soft threshold module in the residual network module sets feature information whose absolute value is below the threshold to zero and shrinks feature information whose absolute value is above the threshold toward zero, thereby eliminating redundant feature information.
Further, the evaluation criteria of the side channel model include the model's Accuracy, Loss and Rank; Accuracy and Loss are used to evaluate the proportion of sensitive information the model extracts from the side channel information, and Rank is used to evaluate the position of the correct key among all possible keys in the side channel attack model.
Further, the Rank evaluation criterion is:
rank(correct_key) = { i | sort(d_k)[i] = d_k[correct_key] }
wherein d_k is a likelihood function value corresponding to each guessed key, sort() is a descending sort function, i, k ∈ {0, 1, …, N}, and N is the number of element tensors.
The deep learning side channel attack system based on the CLRM model has program modules corresponding to the steps of any one of the above technical solutions, and executes, at run time, the steps of the deep learning side channel attack method based on the CLRM model described above.
A computer readable storage medium storing a computer program configured to implement the steps of the CLRM model-based deep learning side channel attack method according to any one of the above technical solutions when called by a processor.
Compared with the prior art, the invention has the beneficial effects that:
the invention discloses a deep learning side channel attack method based on a CLRM model, which designs a CLRM network model integrated with a convolutional neural network module, a long-term and short-term memory network module and a residual network module. Firstly, the convolutional neural network module can extract characteristic information, so that the model does not need to carry out additional pretreatment on energy traces when carrying out side channel attack; secondly, the long-term memory network module is directly connected behind the convolutional neural network module, so that the model can extract features in space and time sequence, sensitive information in side channel information can be more accurately captured, and the problem that the gradient of the model disappears is prevented; and finally, the long-term and short-term memory network module is connected with the residual network module after passing through the maximum pooling layer, so that the redundant characteristic information of the model is eliminated, the characteristic dimension is reduced, the training speed of the model is accelerated, and the phenomenon of overfitting of the model is prevented.
Experiments and model evaluation show that the model has a strong ability to extract and process leakage information features both spatially and temporally, avoids overfitting and gradient vanishing, can recover the correct key with a small number of energy traces, and has a significant advantage in attack efficiency.
Drawings
FIG. 1 is a flow chart of a deep learning side channel attack method based on a CLRM model in an embodiment of the present invention;
fig. 2 is a schematic diagram of the overall structure of a deep learning side channel attack network based on a CLRM model in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a deep learning side channel attack network parameter based on a CLRM model in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a convolutional neural network module in an embodiment of the present invention;
FIG. 5 is a schematic diagram of a long short-term memory network module according to an embodiment of the present invention;
fig. 6 is a schematic diagram of a residual network module in an embodiment of the present invention;
FIG. 7 is a graph of the training and attack performance of the CLRM model in an embodiment of the present invention; wherein graph (a) shows the training and attack performance of the CLRM model on the ASCAD dataset, and graph (b) shows the training and attack performance of the CLRM model on the DPA-contest v4 dataset.
Detailed Description
In the description of the present invention, it should be noted that the terms "first", "second" and "third" mentioned in the embodiments of the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first", "second" or "third" may explicitly or implicitly include one or more such features.
In order that the above objects, features and advantages of the invention will be readily understood, a more particular description of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings.
Specific embodiment 1: referring to fig. 1 to 6, the present invention provides a deep learning side channel attack method based on the CLRM model, comprising the following steps:
s1, acquiring side channel data generated by a cryptographic algorithm in the running process of equipment;
s2, constructing a side channel attack model, wherein the model comprises a convolutional neural network module, a long-term and short-term memory network module and a residual error network module, and the side channel attack model is trained by using side channel data of a known key so as to establish a mapping relation between the side channel data and a correct key;
and S3, performing feature extraction analysis on the acquired side channel data by adopting the trained side channel attack model so as to realize correct recovery of the secret key.
Specific embodiment 2: as shown in fig. 4, the convolutional neural network module is configured to extract features of the leakage information and comprises 3 convolution blocks, wherein each convolution block comprises: a convolution layer, a batch normalization layer, a SeLU activation function layer and a MaxPooling layer. This embodiment is otherwise identical to specific embodiment 1.
Specific embodiment 3: as shown in fig. 5, the long short-term memory network module is used for learning different feature information and comprises 2 CuDNNLSTM layers, a batch normalization layer and an activation function layer. This embodiment is otherwise identical to specific embodiment 2.
Specific embodiment 4: as shown in fig. 6, the residual network module comprises 2 batch normalization layers, 2 SeLU activation function layers, 2 convolution layers and a soft threshold module. This embodiment is otherwise identical to specific embodiment 3.
Specific embodiment 5: as shown in fig. 2 and 3, the specific architecture of the CLRM model includes:
Input layer: the first layer is an Input layer, i.e., the input energy trace vector;
The convolutional neural network module includes 3 convolution blocks:
Convolution block 1: the second layer is a convolution layer Conv, a batch normalization layer BN, an activation function layer SeLU and a maximum pooling layer MaxPooling; the filter size of the convolution layer is 3, the stride is 2, the number of filters is 128, and the padding mode is 'same'; the pooling window of the maximum pooling layer is 2 and the stride is 2;
Convolution block 2: the third layer is a convolution layer Conv, a batch normalization layer BN, an activation function layer SeLU and a maximum pooling layer MaxPooling; the filter size of the convolution layer is 3, the stride is 2, the number of filters is 256, the padding mode is 'same', the pooling window of the maximum pooling layer is 2 and the stride is 2;
Convolution block 3: the fourth layer is a convolution layer Conv, a batch normalization layer BN, an activation function layer SeLU and a maximum pooling layer MaxPooling; the filter size of the convolution layer is 3, the stride is 2, the number of filters is 512, the padding mode is 'same', the pooling window of the maximum pooling layer is 2 and the stride is 2;
Long short-term memory network module: the fifth layer comprises two CuDNNLSTM layers propagating forward and backward, with 128 forward-propagating units and 128 backward-propagating units, and the forward- and backward-propagating CuDNNLSTMs are respectively connected after being processed by a batch normalization layer and an activation function layer;
Maximum pooling layer: the sixth layer is a MaxPooling layer; the pooling window is 2 and the stride is 2;
Residual network module: the seventh layer comprises two batch normalization layers, two SeLU activation function layers, two convolution layers with 256 filters each, a soft threshold module and an identity mapping (shortcut) layer, wherein the convolution kernel size of the convolution layers is 3 and the stride is 2;
Batch normalization layer: the eighth layer is a BN layer, which can address internal covariate shift and alleviate the gradient saturation problem;
Activation function layer: the ninth layer is a SeLU activation function layer, which is self-normalizing and can avoid gradient vanishing and explosion;
Global average pooling layer: the tenth layer is a GAP layer, which can greatly reduce the network parameters and avoid overfitting;
Data flattening layer: the eleventh layer is a Flatten layer, which reduces the number of model parameters and avoids overfitting;
Fully connected layer 1: the twelfth layer is a Dense layer with 4096 units, and the activation function is ReLU;
Fully connected layer 2: the thirteenth layer is a Dense layer with 4096 units, and the activation function is ReLU;
Dropout layer: the fourteenth layer is a Dropout layer with a dropout probability of 0.4, which prevents the model from overfitting;
Output layer: the fifteenth layer is a Prediction layer with 256 units, the activation function is softmax, and the probability distribution over the classes is output. This embodiment is otherwise identical to specific embodiment 4.
In this embodiment, first, the convolutional neural network is used to automatically extract the features of the leakage information, the self-normalizing SeLU activation function is used to avoid gradient vanishing and explosion, and the pooling layers are used to further screen the extracted features; second, the long short-term memory network can solve the gradient attenuation caused by the gradient gradually shrinking during backpropagation, and using CuDNNLSTM allows the computation to run on the GPU, greatly reducing the model training time; third, the residual network can protect the integrity of the extracted and screened leakage information and solve the problem of the update gradient vanishing; global average pooling is also used, which can greatly reduce the network parameters and avoid overfitting.
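For illustration only, the layer sequence of this embodiment can be expressed as the following tf.keras sketch. The trace length of 700 sampling points, the use of the standard LSTM layer in place of the legacy CuDNNLSTM layer, the fixed soft-threshold value and the 1x1 shortcut projection in the residual block are assumptions introduced for this sketch and are not taken from the embodiment itself.

import tensorflow as tf
from tensorflow.keras import layers, models

def conv_block(x, filters):
    # Conv -> BN -> SeLU -> MaxPooling, as in convolution blocks 1 to 3
    x = layers.Conv1D(filters, kernel_size=3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("selu")(x)
    return layers.MaxPooling1D(pool_size=2, strides=2)(x)

def residual_soft_threshold_block(x, filters=256, tau=0.1):
    # Residual module: (BN -> SeLU -> Conv) twice, soft thresholding, identity shortcut.
    # tau is an assumed fixed threshold; the embodiment does not specify how it is chosen.
    y = layers.BatchNormalization()(x)
    y = layers.Activation("selu")(y)
    y = layers.Conv1D(filters, 3, strides=2, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("selu")(y)
    y = layers.Conv1D(filters, 3, strides=1, padding="same")(y)
    y = layers.Lambda(lambda t: tf.sign(t) * tf.maximum(tf.abs(t) - tau, 0.0))(y)
    # 1x1 convolution so the shortcut matches the downsampled shape (assumed detail)
    shortcut = layers.Conv1D(filters, 1, strides=2, padding="same")(x)
    return layers.Add()([y, shortcut])

def build_clrm(trace_len=700, n_classes=256):
    inp = layers.Input(shape=(trace_len, 1))
    x = inp
    for f in (128, 256, 512):          # convolution blocks 1 to 3
        x = conv_block(x, f)
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)  # 128 units per direction
    x = layers.BatchNormalization()(x)
    x = layers.Activation("selu")(x)
    x = layers.MaxPooling1D(pool_size=2, strides=2)(x)
    x = residual_soft_threshold_block(x, filters=256)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("selu")(x)
    x = layers.GlobalAveragePooling1D()(x)
    x = layers.Flatten()(x)            # kept for fidelity; effectively a no-op after GAP
    x = layers.Dense(4096, activation="relu")(x)
    x = layers.Dense(4096, activation="relu")(x)
    x = layers.Dropout(0.4)(x)
    out = layers.Dense(n_classes, activation="softmax")(x)
    return models.Model(inp, out)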
Specific embodiment 6: the soft threshold module in the residual network module sets feature information whose absolute value is below the threshold to zero and shrinks feature information whose absolute value is above the threshold toward zero, thereby eliminating redundant feature information. This embodiment is otherwise identical to specific embodiment 4.
In this embodiment, the soft threshold module is based on the following calculation formula:
y = x - τ when x > τ; y = 0 when |x| ≤ τ; y = x + τ when x < -τ
where x represents the input feature, y represents the output feature, and τ represents the threshold, a positive number with a small value. If the threshold is greater than the absolute values of all the input feature information, the output feature y can only be zero.
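As a small numerical illustration of this soft thresholding operation (the function name and the example values below are illustrative only, not part of the embodiment):

import numpy as np

def soft_threshold(x, tau):
    # values with |x| <= tau become 0; larger values are shrunk toward zero by tau
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

features = np.array([-1.5, -0.2, 0.0, 0.3, 2.0])
print(soft_threshold(features, tau=0.5))   # [-1. -0.  0.  0.  1.5]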
Specific embodiment 7: the evaluation criteria of the side channel model include the model's Accuracy, Loss and Rank; Accuracy and Loss are used to evaluate the proportion of sensitive information the model extracts from the side channel information, and Rank is used to evaluate the position of the correct key among all possible keys in the side channel attack model. This embodiment is otherwise identical to specific embodiment 1.
Specific embodiment 8: the Rank evaluation criterion is:
rank(correct_key) = { i | sort(d_k)[i] = d_k[correct_key] }
wherein d_k is a likelihood function value corresponding to each guessed key, sort() is a descending sort function, i, k ∈ {0, 1, …, N}, and N is the number of element tensors. This embodiment is otherwise identical to specific embodiment 7.
In the modeling stage, an attacker characterizes the physical leakage by acquiring a copy of the target device. Suppose the attacker acquires N_p energy traces, denoted as the set X_profiling = {x_i | i = 1, 2, …, N_p}, where each energy trace x_i corresponds to an intermediate value v_i = f(p_i, k*) of the encryption operation under the known key k*, so that the attacker can construct a model from the profiling set; the probability predicted by the model is shown in formula (1):
Pr[x | V = v]    (1)
In the attack stage, the attacker uses the model to recover the key of the target device. Suppose the attacker acquires N_a (N_a < N_p) energy traces, denoted as the set X_attack = {x_i | i = 1, 2, …, N_a}. Because X_attack and X_profiling are independent of each other, each energy trace x_i corresponds to the same fixed unknown key k*. The attacker can therefore calculate, according to Bayes' theorem, the posterior probability of the intermediate value corresponding to each guessed key k for every energy trace, as shown in formula (2):
Pr[V = v | x] = Pr[x | V = v] · Pr[V = v] / Pr[x]    (2)
Then, using the maximum likelihood criterion and treating the energy traces collected under real conditions as mutually independent, the likelihood function value d_k corresponding to each guessed key is calculated, as shown in formula (3):
d_k = ∏_{i=1}^{N_a} Pr[V = f(p_i, k) | x_i]    (3)
Finally, the maximum likelihood estimate of k is solved; as the number of attack energy traces N_a increases, this estimate eventually equals the correct key k*, as shown in formula (4):
argmax_k d_k → k*    (4)
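The following sketch illustrates how the likelihood accumulation of formula (3) and the Rank criterion can be computed from the outputs of a trained model. The function and variable names, and the assumption that the leakage target is Sbox(p XOR k) with a 256-way model output, are illustrative choices rather than details fixed by this description.

import numpy as np

def rank_correct_key(model_probs, plaintexts, correct_key, sbox):
    # model_probs: (N_a, 256) softmax outputs of the trained model for the attack traces
    # plaintexts:  (N_a,) plaintext byte p_i associated with each attack trace
    d = np.zeros(256)
    for k in range(256):                              # every guessed key byte k
        v = sbox[plaintexts ^ k]                      # guessed intermediate values f(p_i, k)
        # sum of log-probabilities, the log-domain form of formula (3)
        d[k] = np.sum(np.log(model_probs[np.arange(len(v)), v] + 1e-40))
    order = np.argsort(d)[::-1]                       # sort d_k in descending order
    return int(np.where(order == correct_key)[0][0])  # position of the correct key, i.e. Rank

A Rank of 0 means the most likely guessed key already equals the correct key, i.e. the key has been recovered.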
Specific embodiment 9: a deep learning side channel attack system based on the CLRM model, which has program modules corresponding to the steps of any one of specific embodiments 1 to 8 and executes, at run time, the steps of the deep learning side channel attack method based on the CLRM model described above.
Specific embodiment 10: a computer readable storage medium storing a computer program configured to implement, when called by a processor, the steps of the deep learning side channel attack method based on the CLRM model according to any one of specific embodiments 1 to 8.
Example 1
This example uses the ASCAD dataset and the DPA-contest v4 dataset to verify the performance of the CLRM model. The ASCAD dataset is a collection of electromagnetic energy traces acquired with an electromagnetic probe while the AES-128 algorithm runs on an ATMega8515 target device. The DPA-contest v4 dataset was collected from the AES-256 algorithm running on an 8-bit ATMega163 smart card controlled by a SASEBO-W board.
For both the ASCAD dataset and the DPA-contest v4 dataset, the Adam optimizer was employed, the batch size was set to 128, the learning rate was set to 0.0001, and the categorical cross-entropy loss function was used.
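A minimal sketch of this training configuration is shown below; the profiling arrays X_profiling and y_profiling, the validation split and the reuse of the build_clrm sketch shown earlier are placeholders, while the 100 epochs match the number of epochs mentioned later in this example.

from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical

model = build_clrm(trace_len=700, n_classes=256)      # CLRM sketch defined earlier
model.compile(optimizer=Adam(learning_rate=1e-4),     # learning rate 0.0001
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# X_profiling: (N_p, trace_len, 1) profiling traces; y_profiling: intermediate-value labels
model.fit(X_profiling, to_categorical(y_profiling, num_classes=256),
          batch_size=128, epochs=100, validation_split=0.1)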
As shown in fig. 7, Accuracy and Loss are used to evaluate the proportion of sensitive information the model extracts from the side channel information, and Rank is used to evaluate the position of the correct key among all possible keys in the side channel attack model. On the ASCAD dataset, the Loss of the CLRM model drops to 0.3925 and the Accuracy reaches 0.9013 without overfitting, and Rank drops to 0, i.e., the correct key is recovered, using only 40 energy traces. On the DPA-contest v4 dataset, the Loss of the CLRM model drops to 0.3127, the Accuracy reaches 0.9282, and Rank drops to 0, i.e., the correct key is recovered, using only 2 energy traces.
The CNN_best and MLP_best models proposed by Benadjila et al. [1], the Zaid model proposed by Zaid et al. [2], and the CBAPD model proposed by Zheng Dong et al. [3] are compared with the CLRM model; the attack effect comparison is shown in the following table:
in the table "-" indicates that the model cannot recover the correct key with the same finite number of energy traces.
For both the ASCAD dataset and the DPA-contest v4 dataset, the comparison is the number of energy traces required for each model to drop Rank to 0 after 100 epochs, i.e., the number of energy traces required to successfully recover the correct key. The numbers of energy traces required by the CLRM model to successfully recover the correct key on the ASCAD dataset and the DPA-contest v4 dataset are 40 and 2 respectively; compared with the other models, the attack performance of the CLRM model is further improved.
Although the present disclosure is described above, the scope of protection of the present disclosure is not limited thereto. Various changes and modifications may be made by those skilled in the art without departing from the spirit and scope of the present disclosure, and such changes and modifications fall within the scope of protection of the present disclosure.
The literature cited in this example is:
[1] Benadjila R, Prouff E, Strullu R, et al. Deep learning for side-channel analysis and introduction to ASCAD database[J]. Journal of Cryptographic Engineering, 2020, 10(2): 163-188. doi: https://doi.org/10.1007/s13389-019-00220-8.
[2] Zaid G, Bossuet L, Habrard A, et al. Methodology for efficient CNN architectures in profiling attacks[J]. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2020, 2020(1): 1-36. doi: https://doi.org/10.13154/tches.v2020.i1.1-36.
[3] Zheng Dong, Li Ya-Ning, Zhang Mei-Ling. Side-channel attacks based on CBAPD network[J]. Journal of Cryptologic Research, 2022, 9(2): 308-321. doi: https://doi.org/10.13868/j.cnki.jcr.000521.

Claims (10)

1. The deep learning side channel attack method based on the CLRM model is characterized by comprising the following steps:
s1, acquiring side channel data generated by a cryptographic algorithm in the running process of equipment;
s2, constructing a side channel attack model, wherein the model comprises a convolutional neural network module, a long-term and short-term memory network module and a residual error network module, and the side channel attack model is trained by using side channel data of a known key so as to establish a mapping relation between the side channel data and a correct key;
and S3, performing feature extraction analysis on the acquired side channel data by adopting the trained side channel attack model so as to realize correct recovery of the secret key.
2. The CLRM model-based deep learning side channel attack method according to claim 1, wherein the convolutional neural network module is configured to extract features of the leakage information and comprises 3 convolution blocks, each comprising: a convolution layer, a batch normalization layer, a SeLU activation function layer and a MaxPooling layer.
3. The CLRM model-based deep learning side channel attack method according to claim 2, wherein the long short-term memory network module is used for learning different feature information and comprises 2 CuDNNLSTM layers, a batch normalization layer and an activation function layer.
4. The CLRM model-based deep learning side channel attack method as claimed in claim 3, wherein said residual network module comprises 2 batch normalization processing layers, 2 SeLU activation function layers, 2 convolution layers and a soft threshold module.
5. The CLRM model-based deep learning side channel attack method according to claim 4, wherein the specific architecture of the CLRM model includes:
Input layer: the first layer is an Input layer, i.e., the input energy trace vector;
The convolutional neural network module includes 3 convolution blocks:
Convolution block 1: the second layer is a convolution layer Conv, a batch normalization layer BN, an activation function layer SeLU and a maximum pooling layer MaxPooling; the filter size of the convolution layer is 3, the stride is 2, the number of filters is 128, and the padding mode is 'same'; the pooling window of the maximum pooling layer is 2 and the stride is 2;
Convolution block 2: the third layer is a convolution layer Conv, a batch normalization layer BN, an activation function layer SeLU and a maximum pooling layer MaxPooling; the filter size of the convolution layer is 3, the stride is 2, the number of filters is 256, the padding mode is 'same', the pooling window of the maximum pooling layer is 2 and the stride is 2;
Convolution block 3: the fourth layer is a convolution layer Conv, a batch normalization layer BN, an activation function layer SeLU and a maximum pooling layer MaxPooling; the filter size of the convolution layer is 3, the stride is 2, the number of filters is 512, the padding mode is 'same', the pooling window of the maximum pooling layer is 2 and the stride is 2;
Long short-term memory network module: the fifth layer comprises two CuDNNLSTM layers propagating forward and backward, with 128 forward-propagating units and 128 backward-propagating units, and the forward- and backward-propagating CuDNNLSTMs are respectively connected after being processed by a batch normalization layer and an activation function layer;
Maximum pooling layer: the sixth layer is a MaxPooling layer; the pooling window is 2 and the stride is 2;
Residual network module: the seventh layer comprises two batch normalization layers, two SeLU activation function layers, two convolution layers with 256 filters each and a soft threshold module, wherein the convolution kernel size of the convolution layers is 3 and the stride is 2;
Batch normalization layer: the eighth layer is a BN layer;
Activation function layer: the ninth layer is a SeLU activation function layer;
Global average pooling layer: the tenth layer is a GAP layer;
Data flattening layer: the eleventh layer is a Flatten layer;
Fully connected layer 1: the twelfth layer is a Dense layer, and the activation function is ReLU;
Fully connected layer 2: the thirteenth layer is a Dense layer, and the activation function is ReLU;
Dropout layer: the fourteenth layer is a Dropout layer;
Output layer: the fifteenth layer is a Prediction layer, the activation function is softmax, and the probability distribution over the classes is output.
6. The CLRM model-based deep learning side channel attack method according to claim 4, wherein the soft threshold module in the residual network module sets feature information whose absolute value is below the threshold to zero and shrinks feature information whose absolute value is above the threshold toward zero, thereby eliminating redundant feature information.
7. The CLRM model-based deep learning side channel attack method according to claim 1, wherein the evaluation criteria of the side channel model include the model's Accuracy, Loss and Rank; Accuracy and Loss are used to evaluate the proportion of sensitive information the model extracts from the side channel information, and Rank is used to evaluate the position of the correct key among all possible keys in the side channel attack model.
8. The CLRM model-based deep learning side channel attack method according to claim 7, wherein the Rank evaluation criterion is:
rank(correct_key) = { i | sort(d_k)[i] = d_k[correct_key] }
wherein d_k is a likelihood function value corresponding to each guessed key, sort() is a descending sort function, i, k ∈ {0, 1, …, N}, and N is the number of element tensors.
9. A CLRM model-based deep learning side channel attack system, characterized in that the system has program modules corresponding to the steps of any one of claims 1 to 8 and executes, at run time, the steps of the CLRM model-based deep learning side channel attack method.
10. A computer readable storage medium, characterized in that it stores a computer program configured to implement the steps of the CLRM model-based deep learning side channel attack method according to any one of claims 1 to 8 when called by a processor.
CN202311093318.4A 2023-08-28 2023-08-28 Deep learning side channel attack method and system based on CLRM model Pending CN117155531A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311093318.4A CN117155531A (en) 2023-08-28 2023-08-28 Deep learning side channel attack method and system based on CLRM model

Publications (1)

Publication Number Publication Date
CN117155531A true CN117155531A (en) 2023-12-01

Family

ID=88900066

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311093318.4A Pending CN117155531A (en) 2023-08-28 2023-08-28 Deep learning side channel attack method and system based on CLRM model

Country Status (1)

Country Link
CN (1) CN117155531A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111565189A (en) * 2020-04-30 2020-08-21 衡阳师范学院 Side channel analysis method based on deep learning
CN116208311A (en) * 2021-11-30 2023-06-02 南京理工大学 Deep learning side channel attack method and system based on self-attention mechanism
CN114285545A (en) * 2021-12-24 2022-04-05 成都三零嘉微电子有限公司 Side channel attack method and system based on convolutional neural network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
SENGIM KARAYALCIN: "Resolving the Doubts: On the Construction and Use of ResNets for Side-Channel Analysis", Mathematics, 25 July 2023 (2023-07-25) *
ZHOU MAO: "Research on Side-Channel Attack Method Based on LSTM", Communications in Computer and Information Science, 8 July 2022 (2022-07-08) *
PENG PEI: "Side-channel attack combining CNN and LSTM" (in Chinese), Computer Engineering and Applications, 28 January 2022 (2022-01-28) *
WANG JUNNIAN: "Side-channel analysis based on deep learning LSTM" (in Chinese), Computer Engineering, 31 October 2021 (2021-10-31) *
GUO DONGXIN; CHEN KAIYAN; ZHANG YANG; HU XIAOYANG; WEI YANHAI: "A new differential attack method on encryption chips based on convolutional neural networks" (in Chinese), Computer Engineering and Applications, no. 21, 1 November 2018 (2018-11-01) *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination