CN117375792A - Method and device for detecting side channel

Info

Publication number: CN117375792A
Application number: CN202310209869.6A
Authority: CN
Inventors: 张贺; 李屹; 刘彬
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2023-02-24
Filing date: 2023-02-24
Publication date: 2024-01-09

Abstract

Embodiments of the application provide a method and a device for side channel detection. The method comprises: acquiring attribute information of first sensitive data, wherein the attribute information is marked on the first sensitive data; determining, based on the attribute information of the first sensitive data, whether the first sensitive data satisfies a side channel security constraint, the side channel including one or more of a branch execution time side channel, a variable time function side channel, a cache time side channel, and a variable time instruction side channel; and, if the first sensitive data does not satisfy the side channel security constraint, recording error information in a compilation log. Because the sensitive data in the source code is marked with attribute information, multiple kinds of side channel can be detected accurately and comprehensively during compilation based on that information. Side channel risks are therefore identified quickly and accurately at compile time, detection completeness is improved, and false positives and false negatives in side channel risk reporting are largely avoided.

Description

Method and device for detecting side channel

Technical Field

Embodiments of the application relate to the field of side channel detection, and more particularly to a method and a device for side channel detection.

Background

As users pay increasing attention to their personal sensitive information, the side channel security of products is becoming more and more important.

How to perform side channel detection more accurately and more completely, so as to protect sensitive information and further improve product security, is therefore an important problem.

Disclosure of Invention

The application provides a method and a device for side channel detection. In the method, sensitive data in source code is marked with attribute information, and multiple kinds of side channel can be detected accurately and comprehensively during compilation based on that attribute information. Side channel risks are therefore identified quickly and accurately at compile time, detection completeness is improved, and false positives and false negatives in side channel risk reporting are largely avoided.

In a first aspect, a method for side channel detection is provided, the method comprising: acquiring attribute information of first sensitive data, wherein the attribute information is marked on the first sensitive data; determining, based on the attribute information of the first sensitive data, whether the first sensitive data satisfies a side channel security constraint, the side channel including one or more of a branch execution time side channel, a variable time function side channel, a cache time side channel, and a variable time instruction side channel; and, if the first sensitive data does not satisfy the side channel security constraint, recording error information in a compilation log.

Optionally, the error information includes error location and/or constraint violation details.

Optionally, when it is determined that the first sensitive data does not meet the branch execution time side channel security constraint, first error information is recorded in the compilation log.

Optionally, the first error information includes error location information, namely the code line where the marked sensitive data appears in the conditional statement information, and may further include constraint violation details: the branch execution time side channel presents a security risk.

Optionally, when it is determined that the first sensitive data does not meet the variable time function side channel security constraint, second error information is recorded in the compilation log.

Optionally, the second error information includes error location information, namely the code line where the marked sensitive data appears in the parameter list of the calling function, and may further include constraint violation details: the variable time function side channel presents a security risk.

Optionally, when it is determined that the first sensitive data does not meet the cache time side channel security constraint, third error information is recorded in the compilation log.

Optionally, the third error information includes error location information, namely the code line where the marked sensitive data is used as pointer offset access data in the pointer calculation address offset expression, and may further include constraint violation details: the cache time side channel presents a security risk.

Optionally, when it is determined that the first sensitive data does not meet the variable time instruction side channel security constraint, fourth error information is recorded in the compilation log.

Optionally, the fourth error information includes error location information, namely the code line where the marked sensitive data appears in an operand of the operation corresponding to the variable time instruction, and may further include constraint violation details: the variable time instruction side channel presents a security risk.
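As an illustration only, the four kinds of error information could share one record shape: a side channel kind, an error location, and a details string. The sketch below assumes hypothetical names (`SideChannel`, `ErrorInfo`, `format_log_entry`), a hypothetical file name, and a hypothetical log line layout; none of these come from the application.

```python
from dataclasses import dataclass
from enum import Enum


class SideChannel(Enum):
    # The four side channel kinds named in the text.
    BRANCH_EXECUTION_TIME = "branch execution time"
    VARIABLE_TIME_FUNCTION = "variable time function"
    CACHE_TIME = "cache time"
    VARIABLE_TIME_INSTRUCTION = "variable time instruction"


@dataclass(frozen=True)
class ErrorInfo:
    channel: SideChannel
    file: str  # hypothetical: source file containing the marked data
    line: int  # error location: code line of the marked data

    def details(self) -> str:
        # Constraint violation details, phrased as in the text.
        return f"the {self.channel.value} side channel presents a security risk"


def format_log_entry(err: ErrorInfo) -> str:
    """Render one error record as a single compilation-log line (assumed layout)."""
    return f"{err.file}:{err.line}: error: {err.details()}"


entry = format_log_entry(ErrorInfo(SideChannel.CACHE_TIME, "aes.c", 42))
```

Keeping the location and the details in one record lets the log carry either or both, matching the "and/or" wording above.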

Optionally, sensitive data may also be understood as non-public or private data, such as a user's personal data, an organization's internal data, or keys.

In the embodiments of the application, sensitive data in the source code is marked with attribute information, and multiple kinds of side channel can be detected accurately and comprehensively during compilation based on that attribute information. Side channel risks are therefore identified quickly and accurately at compile time, detection completeness is improved, and false positives and false negatives in side channel risk reporting are largely avoided.

With reference to the first aspect, in one possible implementation manner, before acquiring attribute information of the first sensitive data, the method further includes: marking the attribute information on the first sensitive data in the source code.

Optionally, before marking the attribute information on the first sensitive data, the method may further include: registering the attribute of the first sensitive data.

In the embodiment of the application, attribute marking is performed on sensitive data in source code, and an execution basis is provided for subsequent steps (such as security constraint detection based on the attribute information, desensitization mechanism judgment based on the attribute information, type reasoning based on the attribute information, and the like).

With reference to the first aspect, in one possible implementation manner, determining, based on attribute information of the first sensitive data, whether the first sensitive data meets a side channel security constraint includes: determining whether the first sensitive data satisfies a desensitization condition; if the first sensitive data does not meet the desensitization condition, determining whether the first sensitive data meets the side channel security constraint based on the attribute information of the first sensitive data.

In the embodiment of the application, before the detection of the side channel security constraint, whether the sensitive data meets the desensitization condition is determined, and the detection of the side channel security constraint is performed only when the sensitive data does not meet the desensitization condition, so that the false alarm of the side channel risk can be effectively avoided.
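The gating described above can be sketched as follows. This is a minimal illustration under assumptions: the function name `detect_side_channels`, the representation of data as a plain string, and the shape of `checks` (a mapping from channel name to a predicate) are all hypothetical.

```python
def detect_side_channels(data, is_desensitized, checks):
    """Run the constraint checks only for data that is not desensitized.

    checks maps a side channel name to a predicate that returns True when
    the data satisfies that channel's security constraint.
    """
    if is_desensitized(data):
        return []  # desensitized data is skipped, so it cannot raise a false alarm
    return [name for name, satisfies in checks.items() if not satisfies(data)]


# Hypothetical check: anything named "key" violates the branch constraint.
checks = {"branch execution time": lambda d: d != "key"}
flagged = detect_side_channels("key", lambda d: False, checks)
skipped = detect_side_channels("key", lambda d: True, checks)
```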

With reference to the first aspect, in a possible implementation manner, the method further includes: performing type reasoning on the first sensitive data.

It can be understood that the type reasoning above refers to propagating the influence of the first sensitive data through the compilation process.

In the embodiment of the application, the type automatic reasoning of the sensitive data can be realized by utilizing the transmissibility of the marking attribute of the sensitive data, so that the influence of the sensitive data is transmitted in the whole compiling process, the completeness of side channel detection can be improved, and the tool usability in the detection process can be improved.

With reference to the first aspect, in one possible implementation manner, determining, based on attribute information of the first sensitive data, whether the first sensitive data meets a side channel security constraint includes: acquiring conditional statement information stored in the compiled intermediate representation; determining whether the first sensitive data exists in the conditional statement information based on attribute information of the first sensitive data; and if the first sensitive data exists in the conditional statement information, determining that the first sensitive data does not meet the branch execution time side channel security constraint.

In the embodiment of the application, a method for detecting the risk of the branch execution time side channel is provided, and based on attribute information of sensitive data, whether the risk of the branch execution time side channel exists or not is determined in the compiling process, so that the branch execution time side channel in the compiling process can be accurately and comprehensively detected, further, the risk of the branch execution time side channel in the compiling process can be rapidly and accurately identified, and further, the security of sensitive data or keys of users and the like can be improved.
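A minimal sketch of this check, assuming the conditional statement information has already been extracted from the intermediate representation into simple records. The `Conditional` record and the variable names are hypothetical; a real compiler pass would walk its own IR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conditional:
    line: int             # source line of the conditional statement
    variables: frozenset  # names appearing in the condition expression


def find_branch_violations(conditionals, sensitive):
    """Return the lines of conditionals whose condition reads marked data."""
    return [c.line for c in conditionals if c.variables & sensitive]


conditionals = [
    Conditional(10, frozenset({"i", "n"})),    # loop bound, not marked
    Conditional(17, frozenset({"key_bit"})),   # branches on marked data
]
violations = find_branch_violations(conditionals, {"key_bit"})
```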

With reference to the first aspect, in one possible implementation manner, determining, based on attribute information of the first sensitive data, whether the first sensitive data meets a side channel security constraint includes: acquiring calling information of a calling function in a compiling process, wherein the calling information of the calling function comprises the name of the calling function and a parameter list of the calling function; determining whether the calling function is a variable time function according to the name of the calling function; if the calling function is a variable time function, determining whether the first sensitive data exists in a parameter list of the calling function based on attribute information of the first sensitive data; and if the first sensitive data exists in the parameter list of the calling function, determining that the first sensitive data does not meet the variable time function side channel security constraint.

In the embodiment of the application, the method for detecting the risk of the variable time function side channel is provided, and based on attribute information of sensitive data, whether the risk of the variable time function side channel exists or not is determined in the compiling process, so that the variable time function side channel in the compiling process can be accurately and comprehensively detected, the risk of the variable time function side channel in the compiling process can be rapidly and accurately identified, and further safety of sensitive data or keys of users and the like can be improved.
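The two-stage check (function name first, then parameter list) might look like the sketch below. The deny-list contents are an assumption made for illustration; the application does not enumerate which functions count as variable time.

```python
# Hypothetical deny-list; the text does not enumerate variable time functions.
VARIABLE_TIME_FUNCTIONS = {"memcmp", "strcmp"}


def check_call(name, args, sensitive):
    """Return the marked arguments passed to a variable time function."""
    if name not in VARIABLE_TIME_FUNCTIONS:
        return []  # not a variable time function: nothing to report
    return [a for a in args if a in sensitive]


leaky = check_call("memcmp", ["mac", "expected", "len"], {"mac"})
safe = check_call("memcpy", ["dst", "mac", "len"], {"mac"})
```

Checking the name before the parameters keeps the common case cheap, since most calls are not on the list.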

With reference to the first aspect, in one possible implementation manner, determining, based on attribute information of the first sensitive data, whether the first sensitive data meets a side channel security constraint includes: obtaining dereferencing statement information stored in the compiled intermediate representation, the dereferencing statement information comprising a pointer calculation address offset expression; determining, based on the attribute information of the first sensitive data, whether the first sensitive data is used as pointer offset access data in the pointer calculation address offset expression; and if the first sensitive data is used as pointer offset access data in the pointer calculation address offset expression, determining that the first sensitive data does not meet the cache time side channel security constraint.

In the embodiments of the application, a method for detecting the risk of a cache time side channel is provided. Based on the attribute information of the sensitive data, whether a cache time side channel risk exists is determined during compilation, so that cache time side channels can be detected accurately and comprehensively and the corresponding risk identified quickly and accurately at compile time, which further improves the security of users' sensitive data, keys, and the like.
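As a sketch under assumptions, each dereference can be reduced to its base pointer and the set of variables that feed the address offset. The `Dereference` record and the example names (`sbox`, `key_byte`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dereference:
    line: int               # source line of the dereferencing statement
    base: str               # base pointer or array name
    offset_vars: frozenset  # variables used in the address offset expression


def find_cache_violations(derefs, sensitive):
    """Return the lines where marked data is used as a pointer offset."""
    return [d.line for d in derefs if d.offset_vars & sensitive]


derefs = [
    Dereference(21, "buf", frozenset({"i"})),          # buf[i]
    Dereference(30, "sbox", frozenset({"key_byte"})),  # sbox[key_byte]
]
cache_violations = find_cache_violations(derefs, {"key_byte"})
```

A table lookup indexed by marked data is flagged because which memory location is touched then depends on the secret.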

With reference to the first aspect, in a possible implementation manner, determining, based on attribute information of the first sensitive data, whether the first sensitive data meets a side channel security constraint includes: determining the central processing unit (CPU) architecture used when the program is compiled; determining the variable time instructions corresponding to the CPU architecture; determining, based on the attribute information of the first sensitive data, whether the first sensitive data exists in an operand of the operation corresponding to a variable time instruction; and if the first sensitive data exists in an operand of the operation corresponding to a variable time instruction, determining that the first sensitive data does not meet the variable time instruction side channel security constraint.

In the embodiment of the application, the method for detecting the risk of the variable time instruction side channel is provided, and based on attribute information of sensitive data, whether the risk of the variable time instruction side channel exists or not is determined in the compiling process, so that the variable time instruction side channel in the compiling process can be accurately and comprehensively detected, the risk of the variable time instruction side channel in the compiling process can be rapidly and accurately identified, and further safety of sensitive data or keys of users and the like can be improved.
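The architecture-dependent lookup can be illustrated with a per-architecture table. Everything in the table below is a placeholder: the architecture names `arch_a` and `arch_b` and the operation names are hypothetical and make no claim about real hardware.

```python
# Hypothetical table: operation names treated as variable time per target.
# Placeholder values for illustration, not a statement about real hardware.
VARIABLE_TIME_OPS = {
    "arch_a": {"div", "rem"},
    "arch_b": {"div"},
}


def find_instruction_violations(arch, operations, sensitive):
    """Return lines of variable time operations that read marked operands.

    operations is a list of (line, op, operands) tuples.
    """
    variable_time = VARIABLE_TIME_OPS.get(arch, set())
    return [
        line
        for line, op, operands in operations
        if op in variable_time and sensitive.intersection(operands)
    ]


operations = [
    (5, "add", ["key", "r"]),
    (9, "rem", ["key", "modulus"]),
]
on_a = find_instruction_violations("arch_a", operations, {"key"})
on_b = find_instruction_violations("arch_b", operations, {"key"})
```

The same source line is reported for one target and not for the other, which is why the architecture is determined first.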

With reference to the first aspect, in one possible implementation manner, determining whether the first sensitive data meets a desensitization condition includes: constructing a control flow graph consisting of nodes of each branch of the program; when the execution depth of each branch is consistent and each operation of each branch is equivalent, determining that the first sensitive data meets a desensitization condition; or, when the execution depths of the branches are not consistent and/or each step of operation of the branches is not equivalent, determining that the first sensitive data does not meet the desensitization condition.

In the embodiments of the application, a method for determining whether sensitive data meets a desensitization condition is provided. In the method, when the branches of the compiled program satisfy the equivalence condition, the sensitive data is determined to meet the desensitization condition, which provides a basis for subsequently deciding whether security constraint detection is needed.
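A simplified sketch of the equivalence test, assuming each branch of the control flow graph has been flattened into an ordered list of operation kinds and that two operations are "equivalent" when their kinds are equal. Both assumptions are illustrative; the application does not define equivalence at this level of detail.

```python
def satisfies_desensitization(branches):
    """branches: one list of operation kinds per branch, in execution order."""
    if not branches:
        return True  # assumption: nothing to compare counts as equivalent
    depth = len(branches[0])
    if any(len(branch) != depth for branch in branches):
        return False  # execution depths are not consistent
    # Compare the branches step by step; zip(*branches) yields one tuple per step.
    return all(len(set(step)) == 1 for step in zip(*branches))


balanced = satisfies_desensitization([["load", "xor", "store"],
                                      ["load", "xor", "store"]])
uneven = satisfies_desensitization([["load", "xor"], ["load"]])
mismatched = satisfies_desensitization([["add"], ["mul"]])
```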

With reference to the first aspect, in one possible implementation manner, determining whether the first sensitive data meets a desensitization condition includes: obtaining a first list in a program configuration file, the first list comprising one or more obfuscation operations; determining that the first sensitive data satisfies the desensitization condition when the first sensitive data has been processed by an obfuscation operation in the first list; or, when the first sensitive data has not been processed by any obfuscation operation in the first list, determining that the first sensitive data does not satisfy the desensitization condition.

In the embodiments of the application, a method for determining whether sensitive data meets a desensitization condition is provided. In the method, when the sensitive data has been processed by an obfuscation operation listed in the program configuration file, the sensitive data is regarded as mixed with other data and treated the same as that data, so it is determined to meet the desensitization condition. This provides a basis for subsequently deciding whether security constraint detection is needed.
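The configuration-driven variant could be sketched as follows. The JSON format, the key `obfuscation_operations`, and the operation names are assumptions made for illustration; the application only says that the first list lives in a program configuration file.

```python
import json

# Hypothetical configuration content; format and key name are assumptions.
CONFIG_TEXT = '{"obfuscation_operations": ["mask_with_random", "blind"]}'


def load_obfuscation_list(text):
    """Parse the first list (the obfuscation operations) from the configuration."""
    return set(json.loads(text)["obfuscation_operations"])


def is_desensitized(applied_ops, obfuscations):
    """True when the data has been processed by any listed obfuscation operation."""
    return any(op in obfuscations for op in applied_ops)


obfuscations = load_obfuscation_list(CONFIG_TEXT)
masked = is_desensitized(["load", "mask_with_random"], obfuscations)
raw = is_desensitized(["load", "shift"], obfuscations)
```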

With reference to the first aspect, in one possible implementation manner, performing type reasoning on the first sensitive data includes: determining whether a first variable in a program is sensitive or not according to the type rule of the expression and the subtype inference rule of the variable based on the attribute information of the first sensitive data, wherein the first variable refers to the variable except the first sensitive data in the program; when the first variable is determined to be sensitive, an attribute tag of the first sensitive data is passed to the first variable.

Optionally, if the first variable has an operational relationship with the sensitive data, such as an assignment operation, it may be determined that the first variable is affected by the sensitive data, and the attribute tag of the sensitive data is then passed to the first variable.

In the embodiment of the application, the method for type reasoning of the sensitive data is provided, and based on attribute information of the sensitive data and the relation between other variables and the sensitive data, whether the other variables are sensitive is determined, and when the other variables are determined to be influenced by the sensitive data, the attribute of the sensitive data is transmitted to the other variables, so that when safety constraint detection is carried out, the sensitive data can be detected, and meanwhile, the other variables influenced by the sensitive data can be detected, so that the detection process is more complete, and the safety of the sensitive data or keys of users is further improved.
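The propagation of the attribute tag can be sketched as a fixed-point iteration over assignments. This stands in for the expression type rules and subtype inference rules, which the application does not spell out; the function name and the `(target, sources)` representation are hypothetical.

```python
def propagate_sensitivity(assignments, sensitive):
    """Pass the attribute tag to every variable derived from marked data.

    assignments is a list of (target, sources) pairs, meaning that target
    is computed from the variables in sources.
    """
    tagged = set(sensitive)
    changed = True
    while changed:  # repeat until no new variable receives the tag
        changed = False
        for target, sources in assignments:
            if target not in tagged and tagged.intersection(sources):
                tagged.add(target)
                changed = True
    return tagged


assignments = [
    ("u", ["t"]),         # u = f(t)
    ("t", ["key", "r"]),  # t = key op r
    ("v", ["n"]),         # v = g(n), unrelated to the key
]
tagged = propagate_sensitivity(assignments, {"key"})
```

Iterating to a fixed point means the result does not depend on the order in which assignments are listed: `u` is tagged even though it appears before `t`.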

In a second aspect, a data processing method is provided, the method comprising: acquiring attribute information of first data; determining whether the attribute information exists in an information base, wherein the information base comprises at least one of the following: conditional statement information stored in the compiled intermediate representation, a parameter list of a calling function, dereferencing statement information stored in the compiled intermediate representation, or an operand of the operation corresponding to a variable time instruction; and raising an alarm when the attribute information exists in the information base.

With reference to the second aspect, in one possible implementation manner, before acquiring attribute information of the first data, the method further includes: the first data is marked with the attribute information in the source code.

With reference to the second aspect, in one possible implementation manner, determining whether the attribute information exists in the information base includes: determining whether the first data satisfies a desensitization condition; if the first data does not satisfy the desensitization condition, determining whether the attribute information exists in the information base.

With reference to the second aspect, in a possible implementation manner, the method further includes: type reasoning is performed on the first data.

With reference to the second aspect, in a possible implementation manner, the determining whether the attribute information exists in the information base includes: acquiring conditional statement information stored in the compiled intermediate representation; it is determined whether the attribute information exists in the conditional statement information.

With reference to the second aspect, in one possible implementation manner, determining whether the attribute information exists in the information base includes: acquiring calling information of a calling function in a compiling process, wherein the calling information of the calling function comprises the name of the calling function and a parameter list of the calling function; determining whether the calling function is a variable time function according to the name of the calling function; if the calling function is a variable time function, determining whether the attribute information exists in a parameter list of the calling function.

With reference to the second aspect, in a possible implementation manner, the determining whether the attribute information exists in the information base includes: obtaining dereferencing statement information stored in the compiled intermediate representation, the dereferencing statement information comprising a pointer calculation address offset expression; it is determined whether the attribute information is present in pointer offset access data in the pointer calculation address offset expression.

With reference to the second aspect, in a possible implementation manner, determining whether the attribute information exists in the information base includes: determining the central processing unit (CPU) architecture used when the program is compiled; determining the variable time instructions corresponding to the CPU architecture; and determining whether the attribute information exists in an operand of the operation corresponding to a variable time instruction.

With reference to the second aspect, in a possible implementation manner, the determining whether the first data meets a desensitization condition includes: constructing a control flow graph consisting of nodes of each branch of the program; when the execution depth of each branch is consistent and each operation of each branch is equivalent, determining that the first data meets a desensitization condition; or, when the execution depths of the branches are not consistent and/or each step of operation of the branches is not equivalent, determining that the first data does not meet the desensitization condition.

With reference to the second aspect, in one possible implementation manner, determining whether the first data meets a desensitization condition includes: obtaining a first list in a program configuration file, the first list comprising one or more obfuscation operations; determining that the first data satisfies the desensitization condition when the first data has been processed by an obfuscation operation in the first list; or, when the first data has not been processed by any obfuscation operation in the first list, determining that the first data does not satisfy the desensitization condition.

With reference to the second aspect, in one possible implementation manner, performing type reasoning on the first data includes: determining whether a first variable in a program is sensitive or not according to the type rule of the expression and the subtype inference rule of the variable based on the attribute information of the first data, wherein the first variable refers to the variable except the first data in the program; when it is determined that the first variable is sensitive, an attribute tag of the first data is passed to the first variable.

In a third aspect, an apparatus for side channel detection is provided, the apparatus comprising: an acquisition module, configured to acquire attribute information of first sensitive data, wherein the attribute information is marked on the first sensitive data; a constraint detection module, configured to determine, based on the attribute information of the first sensitive data, whether the first sensitive data meets a side channel security constraint, wherein the side channel comprises one or more of a branch execution time side channel, a variable time function side channel, a cache time side channel, and a variable time instruction side channel; and a recording module, configured to record error information in the compilation log when the first sensitive data does not meet the side channel security constraint.

Optionally, when the constraint detection module determines that the first sensitive data does not meet the branch execution time side channel security constraint, the recording module records the first error information in the compilation log.

Optionally, when the constraint detection module determines that the first sensitive data does not meet the variable time function side channel security constraint, the recording module records the second error information in the compilation log.

Optionally, when the constraint detection module determines that the first sensitive data does not meet the cache time side channel security constraint, the recording module records the third error information in the compilation log.

Optionally, when the constraint detection module determines that the first sensitive data does not meet the variable time instruction side channel security constraint, the recording module records fourth error information in the compilation log.

With reference to the third aspect, in one possible implementation manner, the apparatus further includes: a registration module, configured to mark the attribute information on the first sensitive data in the source code.

Optionally, the registration module may be further configured to register the attribute of the first sensitive data before marking the attribute information on the first sensitive data.

In the embodiment of the application, the registration module marks the attribute of the sensitive data in the source code and provides an execution basis for the work of the subsequent modules (for example, security constraint detection based on the attribute information, desensitization mechanism judgment based on the attribute information or type reasoning based on the attribute information).

With reference to the third aspect, in one possible implementation manner, the apparatus further includes: a desensitization module for determining whether the first sensitive data satisfies a desensitization condition; the constraint detection module is specifically used for: and when the first sensitive data does not meet the desensitization condition, determining whether the first sensitive data meets side channel security constraints based on attribute information of the first sensitive data.

In the embodiment of the application, before the detection of the side channel security constraint, the desensitization module determines whether the sensitive data meets the desensitization condition, and the detection of the side channel security constraint can be performed only when the sensitive data does not meet the desensitization condition, so that the false report of the side channel risk can be effectively avoided.

With reference to the third aspect, in one possible implementation manner, the apparatus further includes: and the type reasoning module is used for carrying out type reasoning on the first sensitive data.

In the embodiment of the application, the type reasoning module can realize automatic reasoning of the type of the sensitive data by utilizing the transmissibility of the marking attribute of the sensitive data, so that the influence of the sensitive data is transferred in the whole compiling process, the completeness of side channel detection can be improved, and the tool usability in the detection process can be improved.

With reference to the third aspect, in one possible implementation manner, the constraint detection module is specifically configured to: acquiring conditional statement information stored in the compiled intermediate representation; determining whether the first sensitive data exists in the conditional statement information based on attribute information of the first sensitive data; and when the first sensitive data exists in the conditional statement information, determining that the first sensitive data does not meet the branch execution time side channel security constraint.

In this embodiment of the present application, a device for detecting a risk of a branch execution time side channel is provided, where a constraint detection module determines whether a risk of a branch execution time side channel exists in a compiling process based on attribute information of sensitive data, so that the device can accurately and comprehensively detect a branch execution time side channel in a compiling process, and further can quickly and accurately identify a risk of a branch execution time side channel in a compiling process, and further can improve security of sensitive data or a key of a user.

With reference to the third aspect, in one possible implementation manner, the constraint detection module is specifically configured to: acquire calling information of a calling function during compilation, wherein the calling information comprises the name of the calling function and a parameter list of the calling function; determine, according to the name of the calling function, whether the calling function is a variable time function; when the calling function is a variable time function, determine, based on attribute information of the first sensitive data, whether the first sensitive data exists in the parameter list of the calling function; and when the first sensitive data exists in the parameter list of the calling function, determine that the first sensitive data does not meet the variable time function side channel security constraint.

In this embodiment of the present application, a device for detecting a risk of a variable time function side channel is provided, where the constraint detection module determines whether a risk of a variable time function side channel exists in a compiling process based on attribute information of sensitive data, so that the device can accurately and comprehensively detect the variable time function side channel in the compiling process, and further can quickly and accurately identify the risk of the variable time function side channel in the compiling process, and further can improve security of user sensitive data or a secret key and the like.

With reference to the third aspect, in one possible implementation manner, the constraint detection module is specifically configured to: obtain dereferencing statement information stored in the compiled intermediate representation, the dereferencing statement information comprising a pointer calculation address offset expression; determine, based on attribute information of the first sensitive data, whether the first sensitive data is used as pointer offset access data in the pointer calculation address offset expression; and when the first sensitive data is used as pointer offset access data in the pointer calculation address offset expression, determine that the first sensitive data does not meet the cache time side channel security constraint.

In this embodiment of the application, a device for detecting the risk of a cache time side channel is provided. The constraint detection module determines, based on attribute information of the sensitive data, whether a cache time side channel risk exists during compilation, so that the device can detect cache time side channels accurately and comprehensively and identify the corresponding risk quickly and accurately at compile time, which further improves the security of users' sensitive data, keys, and the like.

With reference to the third aspect, in one possible implementation manner, the constraint detection module is specifically configured to: determine the central processing unit (CPU) architecture used when the program is compiled; determine the variable time instructions corresponding to the CPU architecture; determine, based on attribute information of the first sensitive data, whether the first sensitive data exists in an operand of the operation corresponding to a variable time instruction; and when the first sensitive data exists in an operand of the operation corresponding to a variable time instruction, determine that the first sensitive data does not meet the variable time instruction side channel security constraint.

In this embodiment of the present application, a device for detecting the risk of a variable time instruction side channel is provided. The constraint detection module determines, based on attribute information of sensitive data, whether a variable time instruction side channel risk exists in the compiling process, so that the device can accurately and comprehensively detect the variable time instruction side channel, quickly and accurately identify the risk, and thereby improve the security of users' sensitive data or keys.
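The variable time instruction check can be sketched in the same spirit. This is an assumption-laden illustration: the architecture names `arch_a` and `arch_b`, the contents of `VARIABLE_TIME_OPS`, and the `(operation, operands)` tuples are all hypothetical placeholders, not data from any real CPU or from this application.

```python
# Hypothetical table: architecture name -> operations assumed to be variable time.
# The entries are placeholders for illustration, not vendor data.
VARIABLE_TIME_OPS = {
    "arch_a": {"div", "mod"},
    "arch_b": {"div", "mod", "mul"},
}


def find_variable_time_risks(arch, operations, sensitive):
    """Return (op, operands) pairs where a variable time op uses sensitive data."""
    risky_ops = VARIABLE_TIME_OPS.get(arch, set())
    return [
        (op, operands)
        for op, operands in operations
        if op in risky_ops and any(v in sensitive for v in operands)
    ]


operations = [
    ("mul", ("key", "x")),
    ("div", ("key", "n")),
    ("div", ("a", "b")),
    ("add", ("key", "a")),
]
risks_a = find_variable_time_risks("arch_a", operations, {"key"})
risks_b = find_variable_time_risks("arch_b", operations, {"key"})
```

The same program yields different findings for different target architectures, which is why the CPU architecture is determined first.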

With reference to the third aspect, in one possible implementation manner, the desensitizing module is specifically configured to: constructing a control flow graph consisting of nodes of each branch of the program; when the execution depth of each branch is consistent and each operation of each branch is equivalent, determining that the first sensitive data meets a desensitization condition; or, when the execution depths of the branches are not consistent and/or each step of operation of the branches is not equivalent, determining that the first sensitive data does not meet the desensitization condition.

In the embodiment of the application, a device for determining whether the sensitive data meets the desensitization condition is provided. When the branches of the program satisfy the equivalence condition, the desensitization module determines that the sensitive data meets the desensitization condition, which provides a basis for subsequently determining whether security constraint detection is needed.
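A rough sketch of the branch equivalence test follows. It assumes each branch taken from the control flow graph has already been flattened into a list of operation kinds; the function name and the operation labels are hypothetical.

```python
def satisfies_desensitization(branches):
    """Check the equivalence condition over a list of branches.

    Each branch is a list of operation kinds in execution order. The result is
    True only if all branches have the same depth and the same operation kind
    at every step.
    """
    if not branches:
        return True
    reference = branches[0]
    for branch in branches[1:]:
        if len(branch) != len(reference):                    # depths differ
            return False
        if any(a != b for a, b in zip(branch, reference)):   # a step differs
            return False
    return True


balanced = [["load", "add", "store"], ["load", "add", "store"]]
unbalanced_depth = [["load", "add", "store"], ["load", "store"]]
unbalanced_ops = [["load", "add", "store"], ["load", "div", "store"]]

print(satisfies_desensitization(balanced))          # True
print(satisfies_desensitization(unbalanced_depth))  # False
print(satisfies_desensitization(unbalanced_ops))    # False
```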

With reference to the third aspect, in one possible implementation manner, the desensitizing module is further specifically configured to: obtaining a first list in a program configuration file, the first list comprising one or more obfuscation operations; determining that the first sensitive data satisfies a desensitization condition when the first sensitive data is processed by an obfuscation operation in the first list; or, when the first sensitive data is not processed by an obfuscation operation in the first list, determining that the first sensitive data does not satisfy a desensitization condition.

In the embodiment of the present application, a device for determining whether the sensitive data meets a desensitization condition is provided. When the sensitive data is processed by an obfuscation operation listed in the program configuration file, this indicates that the sensitive data has been obfuscated together with other data and is treated the same as that data, so the desensitization module determines that the sensitive data meets the desensitization condition, which provides a basis for subsequently determining whether security constraint detection is required.
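The configuration-based test can be illustrated as a simple membership check. The operation names `rsa_blind` and `xor_mask`, the shape of the per-variable operation history, and the function name are assumptions made for this sketch only.

```python
def is_obfuscated(variable, applied_operations, obfuscation_list):
    """True if `variable` was processed by any operation named in the first list."""
    history = applied_operations.get(variable, ())
    return any(op in obfuscation_list for op in history)


# Hypothetical configuration content and per-variable operation history.
first_list = {"rsa_blind", "xor_mask"}
applied = {"key": ["load", "xor_mask"], "nonce": ["load"]}

print(is_obfuscated("key", applied, first_list))    # True
print(is_obfuscated("nonce", applied, first_list))  # False
```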

With reference to the third aspect, in one possible implementation manner, the type inference module is specifically configured to: determining whether a first variable in a program is sensitive or not according to the type rule of the expression and the subtype inference rule of the variable based on the attribute information of the first sensitive data, wherein the first variable refers to the variable except the first sensitive data in the program; when the first variable is determined to be sensitive, an attribute tag of the first sensitive data is passed to the first variable.

Optionally, if the first variable has an operational relationship with the sensitive data, such as an assignment operation, the type inference module may determine that the first variable is affected by the sensitive data, and further transmit an attribute flag of the sensitive data to the first variable.

In the embodiment of the application, a device for type reasoning is provided, in which a type reasoning module determines whether other variables are sensitive based on attribute information of the sensitive data and the relation between the other variables and the sensitive data, and when the other variables are determined to be influenced by the sensitive data, the attribute of the sensitive data is transmitted to the other variables, so that when security constraint detection is performed, the sensitive data can be detected, and meanwhile, the other variables influenced by the sensitive data can be detected, so that the detection process is more complete, and the security of the sensitive data or keys of users is further improved.
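The propagation of the attribute tag can be sketched as a fixed-point iteration over assignments. This is a simplified model under stated assumptions: each statement is reduced to a hypothetical `(target, sources)` pair, and any operational relationship with a tagged source is treated as making the target sensitive.

```python
def propagate_sensitivity(assignments, sensitive):
    """Propagate the sensitive tag from source variables to target variables.

    assignments: list of (target, sources) pairs, one per statement.
    Repeats until no new variable is tagged, so statement order does not matter.
    """
    tagged = set(sensitive)
    changed = True
    while changed:
        changed = False
        for target, sources in assignments:
            if target not in tagged and any(s in tagged for s in sources):
                tagged.add(target)
                changed = True
    return tagged


assignments = [
    ("u", ["t"]),          # u = t        (appears before t is assigned)
    ("t", ["key", "c"]),   # t = key ^ c
    ("v", ["c"]),          # v = c        (never touches sensitive data)
]
tagged = propagate_sensitivity(assignments, {"key"})
print(sorted(tagged))  # ['key', 't', 'u']
```

Variables `t` and `u` inherit the tag and would then be examined by the security constraint checks, while `v` stays untagged.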

With reference to the third aspect, in one possible implementation manner, the desensitizing module is further configured to: and when the first sensitive data meets the desensitization condition, carrying out desensitization operation on the first sensitive data.

In a fourth aspect, there is provided an apparatus for data processing, the apparatus comprising: the acquisition module is used for acquiring attribute information of the first data; a determining module, configured to determine whether the attribute information exists in an information repository, where the information repository includes at least one of: compiling conditional statement information stored in the intermediate representation, a parameter list of the calling function, and an operand of the corresponding operation of the dereferencing statement information or the variable time instruction stored in the intermediate representation; and the alarm module is used for alarming when the attribute information exists in the information base.

With reference to the fourth aspect, in a possible implementation manner, the apparatus further includes: and the registration module is used for marking the attribute information for the first data in the source code.

With reference to the fourth aspect, in a possible implementation manner, the apparatus further includes a desensitization module, configured to determine whether the first data meets a desensitization condition; the determining module is specifically used for: if the first data does not satisfy the desensitization condition, it is further determined whether the attribute information exists in the information base.

With reference to the fourth aspect, in a possible implementation manner, the apparatus further includes: and the type reasoning module is used for carrying out type reasoning on the first data.

With reference to the fourth aspect, in one possible implementation manner, the determining module is specifically configured to: acquiring conditional statement information stored in the compiled intermediate representation; it is determined whether the attribute information exists in the conditional statement information.

With reference to the fourth aspect, in one possible implementation manner, the determining module is specifically configured to: acquiring calling information of a calling function in a compiling process, wherein the calling information of the calling function comprises the name of the calling function and a parameter list of the calling function; determining whether the calling function is a variable time function according to the name of the calling function; if the calling function is a variable time function, determining whether the attribute information exists in a parameter list of the calling function.
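As a hedged sketch of this check, calling information can be modeled as `(name, parameter_list)` pairs. The set `VARIABLE_TIME_FUNCTIONS` is a hypothetical example list; which functions count as variable time is an assumption here and would come from configuration in practice.

```python
# Hypothetical list of function names treated as variable time.
VARIABLE_TIME_FUNCTIONS = {"memcmp", "strcmp"}


def find_variable_time_calls(calls, tagged):
    """Return calls to variable time functions whose parameters carry the tag.

    calls: list of (function_name, parameter_list) pairs.
    tagged: set of variable names marked with the attribute information.
    """
    return [
        (name, params)
        for name, params in calls
        if name in VARIABLE_TIME_FUNCTIONS and any(p in tagged for p in params)
    ]


calls = [
    ("memcmp", ["mac", "expected", "len"]),  # tagged data in a variable time call
    ("memcmp", ["a", "b", "len"]),           # variable time, but no tagged data
    ("memcpy", ["dst", "mac", "len"]),       # tagged data, but not in the list
]
risky = find_variable_time_calls(calls, {"mac"})
```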

With reference to the fourth aspect, in one possible implementation manner, the determining module is specifically configured to: obtaining dereferencing statement information stored in the compiled intermediate representation, the dereferencing statement information comprising a pointer calculation address offset expression; it is determined whether the attribute information is present in pointer offset access data in the pointer calculation address offset expression.

With reference to the fourth aspect, in one possible implementation manner, the determining module is specifically configured to: determining a CPU architecture of a central processing unit when a program is compiled; determining a variable time instruction corresponding to the CPU architecture; it is determined whether the attribute information is present in an operand of the variable time instruction corresponding operation.

With reference to the fourth aspect, in one possible implementation manner, the desensitizing module is specifically configured to: constructing a control flow graph consisting of nodes of each branch of the program; when the execution depth of each branch is consistent and each operation of each branch is equivalent, determining that the first data meets a desensitization condition; or, when the execution depths of the branches are not consistent and/or each step of operation of the branches is not equivalent, determining that the first data does not meet the desensitization condition.

With reference to the fourth aspect, in one possible implementation manner, the desensitizing module is specifically configured to: obtaining a first list in a program configuration file, the first list comprising one or more obfuscation operations; determining that the first data satisfies a desensitization condition when the first data is processed by an obfuscation operation in the first list; or, when the first data is not processed by an obfuscation operation in the first list, determining that the first data does not satisfy a desensitization condition.

With reference to the fourth aspect, in one possible implementation manner, the type inference module is specifically configured to: determining whether a first variable in a program is sensitive or not according to the type rule of the expression and the subtype inference rule of the variable based on the attribute information of the first data, wherein the first variable refers to the variable except the first data in the program; when it is determined that the first variable is sensitive, an attribute tag of the first data is passed to the first variable.

In a fifth aspect, an electronic device is provided, the electronic device comprising a memory for storing computer program code and a processor for executing the computer program code stored in the memory for implementing the method of the first aspect or any of the possible implementations of the first aspect or for implementing the method of the second aspect or any of the possible implementations of the second aspect.

In a sixth aspect, there is provided a chip having instructions stored therein which, when run on a device, cause the chip to perform the method of the first aspect or any of the possible implementations of the first aspect, or the method of the second aspect or any of the possible implementations of the second aspect.

In a seventh aspect, a computer readable storage medium is provided, in which a computer program or instructions is stored which, when executed, implement the method of the first aspect or any one of the possible implementations of the first aspect or the method of the second aspect or any one of the possible implementations of the second aspect.

In an eighth aspect, a computer program product is provided, comprising computer program code for implementing the method of the first aspect or any of the possible implementations of the first aspect or the method of the second aspect or any of the possible implementations of the second aspect, when the computer program code is run on a computer.

Drawings

FIG. 1 is a flow chart of a prior art side channel detection method;

FIG. 2 is a schematic diagram of another prior art side channel detection method;

FIG. 3 is a schematic functional block diagram of a side channel detection device according to an embodiment of the present application;

FIG. 4 is a schematic functional block diagram of another side channel detection apparatus according to an embodiment of the present application;

FIG. 5 is a schematic diagram of an implementation form of a method for detecting a side channel in software according to an embodiment of the present application;

FIG. 6 is a schematic flow chart diagram of a method for side channel detection provided by an embodiment of the present application;

FIG. 7 is a schematic flow chart diagram of a method of side channel detection provided by an embodiment of the present application;

FIG. 8 is a schematic flow chart diagram of a method for obtaining attributes of sensitive data provided by an embodiment of the present application;

FIG. 9 is a schematic flow chart diagram of a method of security constraint detection based on sensitive data provided by an embodiment of the present application;

FIG. 10 is a schematic flow chart diagram of a method for determining whether there is a branch execution time side channel security risk provided by an embodiment of the present application;

FIG. 11 is a schematic flow chart diagram of a method of determining whether there is a variable time function side channel security risk provided by an embodiment of the present application;

FIG. 12 is a schematic flow chart diagram of a method for determining whether there is a cache time side channel security risk provided by an embodiment of the present application;

FIG. 13 is a schematic flow chart diagram of a method of determining whether there is a variable time instruction side channel security risk provided by an embodiment of the present application;

FIG. 14 is a schematic flow chart diagram of a method of determining whether first sensitive data satisfies a desensitization condition provided by an embodiment of the present application;

FIG. 15 is a schematic flow chart diagram of yet another method of determining whether first sensitive data satisfies a desensitization condition provided by an embodiment of the present application;

FIG. 16 is a schematic flow chart diagram of a method of side channel detection provided by an embodiment of the present application;

FIG. 17 is a schematic flow chart of a method for type reasoning based on first sensitive data provided in an embodiment of the present application.

Detailed Description

The technical solutions in the present application will be described below with reference to the accompanying drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application.

In the description of the embodiments of the present application, unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. "And/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist together, or B exists alone. In addition, in the description of the embodiments of the present application, "plural" or "plurality" means two or more.

The terms "first" and "second" are used below for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present embodiment, unless otherwise specified, the meaning of "plurality" is two or more.

The method provided by the embodiment of the application can be applied to electronic devices such as mobile phones, tablet computers, wearable devices, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks, and personal digital assistants (PDA). The embodiment of the application does not limit the specific type of the electronic device.

In the following, some terms involved in the embodiments of the present application will be described in detail.

1. Sensitive data

Sensitive data refers to data whose leakage may cause serious harm to society or individuals. It includes personal privacy data such as names, identification numbers, addresses, telephone numbers, bank account numbers, email addresses, passwords, medical information, and educational background, as well as data that enterprises or social institutions consider unsuitable for publication, such as business conditions, network structures, and IP address lists. In cryptographic algorithms, sensitive data generally refers to a secret key or private key.

2. Side channel attack

A side channel attack refers to a method of attacking a cryptographic implementation (including a cryptographic chip, a cryptographic module, a cryptographic system, etc.) by analyzing information leaked during its operation, such as execution time, in order to finally recover the key.

3. Static detection technology

Static detection technology refers to scanning the source program or binary code of the software under test, understanding the behavior of the program through its syntax and semantics, directly analyzing the characteristics of the program under test, and searching for anomalies that may cause errors.

4. Dynamic detection technology

Dynamic detection technology refers to analyzing the behavior of a software system by actually running the program in a controlled environment with specific inputs and comparing the results with the expected results to check whether the system operates correctly.

5. Key recovery

Key recovery refers to the process by which an attacker guesses a key, or even obtains it directly, by means of leaked information.

6. Branch execution time side channel

The branch execution time side channel refers to the case where sensitive data influences which branch of a program is executed, so that the program has different execution times for different inputs, and an attacker can guess the sensitive data from the execution time difference.
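A small illustration may help here; it is not taken from this application. The hypothetical function `leaky_equal` exits at the first mismatching byte, so the amount of work (reported as a step count instead of measured time) depends on how much of the secret the guess matches. The standard library function `hmac.compare_digest` is shown as a comparison designed to avoid such content-dependent early exits.

```python
import hmac


def leaky_equal(secret, guess):
    """Early-exit comparison; also returns the number of bytes compared."""
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    for a, b in zip(secret, guess):
        steps += 1
        if a != b:            # branch condition depends on the secret
            return False, steps
    return True, steps


secret = b"abcd"
print(leaky_equal(secret, b"xbcd"))  # (False, 1): first byte wrong
print(leaky_equal(secret, b"abxx"))  # (False, 3): two bytes right
print(hmac.compare_digest(secret, b"abxx"))  # False
```

The differing step counts for the two wrong guesses are exactly the kind of input-dependent behavior an attacker could observe as a time difference.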

7. Cache time side channel

The cache time side channel, also called the cache timing side channel, refers to the case where sensitive data determines which memory data is accessed. Because a cache hit and a cache miss differ considerably in access time, an attacker can guess the content of the sensitive data from the time difference.

8. Variable time instruction

A variable time instruction refers to a central processing unit (CPU) instruction whose execution time differs depending on its operands.

9. Compiler

A compiler is a program that translates a high-level computer language, which is convenient for people to write, read, and maintain, into a low-level machine language that a computer can recognize and run. Common compilers for the C language include GCC and Clang.

10. Compiling plug-in

A compiling plug-in allows a user to intervene in the compiling process of the compiler, acquire various data produced during compilation, and even modify the intermediate data generated during compilation, thereby modifying the behavior of the finally generated binary file.

11. Compiled intermediate representation

The compiled intermediate representation is a general form of code, produced during compilation, that is independent of both the programming language and the target machine. It contains information such as the functions and execution flow of the program, and a user can customize modifications and optimizations on it.

12. Compile-time instrumentation

Compile-time instrumentation refers to modifying existing code, or generating new code and inserting it into the original code, during code compilation.

13. Attribute tagging

Attribute tagging refers to attaching, on top of a variable's type, attributes that can be used for problem detection.

14. Subtype system

A subtype system means that a partial order relation exists between types. S < T indicates that any value of type S can also be treated as a value of type T; that is, the element set of S is a subset of the element set of T, and type S is a subtype of type T.

15. Plug-in type system

A plug-in type system adds new types on top of the original type system without affecting the original system.

16. Type reasoning

Type reasoning (also called type inference) refers to the ability of a programming language to automatically derive the data type of a value at compile time. It is a feature of some strongly and statically typed languages, and functional programming languages generally have it. Automatic type inference simplifies many programming tasks: the programmer can omit type annotations while type checking is still performed.

17. Taint analysis

Taint analysis is a technique that tracks and analyzes the flow of tainted information in a program. In vulnerability analysis, the data of interest (typically external input to the program) is marked as tainted data, and the flow of information related to the tainted data is then tracked to determine whether it can affect critical program operations, thereby uncovering program vulnerabilities.

18. Desensitization mechanism

The desensitization mechanism refers to obfuscating sensitive data during the execution of a product or program, so that an attacker cannot guess the original sensitive data content even if a side channel problem exists during calculation or processing. Likewise, with a constant time implementation the attacker cannot guess the sensitive data. Both constant time implementations and obfuscation operations therefore belong to the desensitization mechanism of the data.

19. PASS

PASS is the collective term for operations such as acquiring and processing intermediate representation (IR) information in the intermediate stage of compilation.

20. Branching condition judgment

Branch condition judgment refers to evaluating whether variables in the program satisfy a condition; when the judgment results differ, the branch flow that is executed also differs.

21. Pointer offset

Pointer offset refers to the following: when a program accesses memory space, it reads the content at a pointer address, and the pointer holds the first address. To read other addresses, the corresponding address offset must be added to the pointer.

22. Variable time function

A variable time function refers to a function whose execution time is related to the parameter that is entered when the function is called.

23. Program control flow graph

A program control flow graph is an abstract representation of a process or program. It is an abstract data structure used and maintained internally by the compiler, and it represents all paths that a program may traverse during execution. It represents, in graph form, the possible jump flows between all basic blocks in a process, and can reflect the real-time execution of the process.

24. GIMPLE

GIMPLE is a general form of code, used in the compiling process of the GNU Compiler Collection (GCC), that is independent of both the programming language and the target machine. As an intermediate representation, it contains information such as the functions and execution flow of the program, and a user can customize modifications and optimizations on it.

25. gimple_cond

gimple_cond is a keyword in the GIMPLE intermediate representation that stores the condition judgment content of the program.

26. gimple_call

gimple_call is a keyword in the GIMPLE intermediate representation that stores information about the functions called by the program.

27. indirect_ref

indirect_ref is a keyword in the GIMPLE intermediate representation that stores the expression for calculating an address offset. It performs dereferencing, that is, it reads the content at an address.

28. pointer_plus_expr

pointer_plus_expr is a keyword in the GIMPLE intermediate representation that stores a pointer addition expression. Similarly, pointer_sub_expr represents pointer subtraction. Both are used to calculate the pointer offset.

29. cache hit

A cache hit means that, when a program requests data, the data is first looked up in the cache and the required data is found there.

30. cache miss

A cache miss means that, when a program requests data, the data is searched for level by level in the cache and cannot be found, so it must be read from slower storage, such as main memory or the disk.

31. Floating point number

Because of resource limitations, decimal fractions in mathematics cannot be represented exactly in a computer. Floating point numbers were introduced as an approximate representation of such fractions. A floating point number is a number represented by four parts (sign, mantissa, radix, and exponent). It carries some error, and because the representation is complex, computer operations on floating point numbers are relatively slow.

32. Operand(s)

The operand refers to a participant in instruction execution, and is an object of various operations, namely data needing to participate in the operations.

33. Equivalence operation

Equivalence operations refer to the case where, in a program, each step of the different execution branches is of the same operation type, so that the execution cost of each step is also the same. Such operations may be referred to as equivalence operations.

34. Obfuscation operation

An obfuscation operation is a means of hiding specific information so that an attacker cannot obtain the original data. There is usually a corresponding de-obfuscation operation that restores the data.

35. Dominator tree algorithm

Given a directed graph with a starting point S and an ending point T, the dominator tree algorithm finds the points that every path from S to T must traverse (that is, the intersection of the point sets of all such paths). These points are called dominance points (dominators). In short, if a dominance point is deleted, no path from S can reach T. The tree formed by the dominance points is called a dominator tree, and the process of finding the dominance points and constructing the dominator tree is called the dominator tree algorithm.
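To make the definition concrete, the following sketch computes, for every node, the set of points that all paths from the starting point must traverse to reach it. It uses the simple iterative set-intersection formulation rather than any particular algorithm named in this application, and the graph, node names, and function name are hypothetical.

```python
def dominators(graph, start):
    """Map each node to the set of nodes that every path from `start` must pass.

    graph: dict mapping a node to the list of its successor nodes.
    Assumes every node is reachable from `start`.
    """
    nodes = set(graph) | {s for succs in graph.values() for s in succs}
    preds = {n: set() for n in nodes}
    for n, succs in graph.items():
        for s in succs:
            preds[s].add(n)

    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates"
    dom[start] = {start}
    changed = True
    while changed:
        changed = False
        for n in nodes - {start}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# Diamond-shaped graph: S splits into A and B, which rejoin at C before T.
graph = {"S": ["A", "B"], "A": ["C"], "B": ["C"], "C": ["T"], "T": []}
dom = dominators(graph, "S")
print(sorted(dom["T"]))  # ['C', 'S', 'T']
```

Neither `A` nor `B` appears in the result for `T`, because a path can bypass either one, whereas removing `C` disconnects `S` from `T`.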

36. RSA

RSA is an asymmetric encryption algorithm, meaning that encryption and decryption use different keys, namely an encryption key and a decryption key.

37. Signature algorithm

A signature algorithm signs information with the private key of a public-private key pair, and provides non-repudiation and identity authentication.

38. Blind signature

A blind signature refers to transforming the message m before RSA signing by means of an additional parameter r, where r must satisfy the condition that its greatest common divisor with n is 1. The message m is converted into m' by $m' = m r^{e} \bmod n$, where e is the RSA public exponent.
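A worked toy example of the blinding transformation follows. It assumes that the exponent applied to r is the RSA public exponent e, as in standard RSA blinding; the unblinding step (multiplying by the modular inverse of r) is the standard counterpart and is not spelled out in the text above. The parameters are deliberately tiny and unsuitable for real use.

```python
from math import gcd

# Toy RSA parameters, far too small for real use.
p, q = 61, 53
n = p * q            # modulus, 3233
e, d = 17, 2753      # public and private exponents: e * d = 1 mod (p-1)(q-1)

m = 65               # message to be signed
r = 7                # blinding parameter; must satisfy gcd(r, n) == 1
assert gcd(r, n) == 1

m_blind = (m * pow(r, e, n)) % n    # m' = m * r^e mod n
s_blind = pow(m_blind, d, n)        # private key operation on m' instead of m
s = (s_blind * pow(r, -1, n)) % n   # remove the blinding (needs Python 3.8+)

print(s == pow(m, d, n))   # True: same signature as signing m directly
print(pow(s, e, n) == m)   # True: the signature verifies
```

Because the private key operation is applied to the blinded value m' rather than to m, any timing variation in that operation is decorrelated from the original message.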

39. Type rules

Type rules are inference rules that describe how the type system assigns types to syntactic structures. The type system can apply these rules to determine whether a program is correctly typed and what type an expression has.

40. Subtype inference rules

Subtype inference rules mean that, when variables take part in operations such as assignment and arithmetic in an expression, the source variables pass their type attributes to the target variable. This is a process of merging subtypes into a larger type, that is, the type of each source variable < the type of the target variable.

41. Cryptographic algorithm protocol

A cryptographic algorithm protocol is based on a cryptographic algorithm and operates in a network or distributed system. By means of the cryptographic algorithm, it provides the parties that have security requirements with a series of steps for purposes such as identity authentication, key distribution, and protection of transmitted information. These communication flows constitute the protocol.

42. Data sensitive product

A data-sensitive product refers to a product that handles sensitive data and needs to ensure the security of data such as users' personal information.

It should be understood that the technical solutions in the embodiments of the present application may be used in Android, iOS, HarmonyOS (Hongmeng), and other systems.

The technical scheme of the embodiment of the application can be applied to a side channel detection scene, in particular to a programming device or an electronic device which can be used for programming.

The electronic device in the embodiment of the present application may be a television, a desktop computer, a notebook computer, a portable electronic device such as a mobile phone, a tablet computer, a camera, or a video recorder, another electronic device with a storage function, an electronic device in a 5G network, or an electronic device in a future evolved public land mobile network (PLMN), which is not limited in this application.

Currently, many methods for detecting side channel problems have been proposed, but few of them are usable in industry; those that are have poor usability and cover only a small number of side channel problems.

Fig. 1 shows a flow chart of a prior art side channel detection method 100. As shown in fig. 1, the method 100 includes:

S101: Preprocess the source code to extract the code related to a network protocol, so that static taint analysis can be performed on it.

S102: Based on the code related to the network protocol, determine the tainted variables in the network protocol through static taint analysis.

S103: Based on the shared variables acquired from the code related to the network protocol, determine the taint propagation path of each shared variable by traversing the tainted variables in reverse.

The taint propagation path of a shared variable can be used to uncover side channel vulnerabilities in the network protocol.

This method uses static taint analysis and can find side channel vulnerabilities in a protocol. However, because it uses instrumentation or code rewriting, modifying the program often imposes a huge overhead on the analysis system, so for complex programs the method suffers from long detection cycles, analyses that cannot be completed, and a high false positive rate. In addition, the method analyzes only branch paths and therefore finds only branch execution time side channel problems, so its detection range is small.

Fig. 2 shows a schematic diagram of another prior art side channel detection method 200. As shown in fig. 2, the method 200 includes:

S201: Call a rule parsing module to parse the rule file based on the predefined rule base and user-defined rules.

S202: Construct regular expressions based on the parsed rule file.

S203: Perform rule matching on the program.

S204: Generate structural information based on the file under test to obtain the control flow and data flow of the program.

S205: Perform path alias analysis on the program to obtain its path alias information.

S206: Perform static detection by combining the control flow, data flow information, and path alias information of the program.

S207: Package and output the static detection result.

This method uses rule matching, which to some extent overcomes shortcomings of existing static detection techniques such as low speed. However, the rule files used for detection are entirely customized by users, and different users configure rules differently, so rules for some problems are easily omitted and missed reports occur. The detection effect therefore depends heavily on the quality of the detection rules. In addition, the rules in this method apply to the whole program and do not track variables or data, so false positives occur easily.

Therefore, most existing side channel detection approaches use dynamic detection, which is strongly tied to coverage and therefore produces false negatives. A small number of static detection methods exist, but they are costly and narrow in detection range, and can hardly find all time side channel problems in software. Moreover, existing side channel detection tools are hard to use: they require detection configuration, writing of test code, and similar work, so the detection workload is large.

Based on the above, the application provides a method and a device for side channel detection. The method extends the functions of a native compiler, uses static side channel detection, combines security constraint judgment, and takes the propagation of sensitive data into account. It can improve the efficiency, accuracy, and completeness of side channel detection and thereby improve the security of a program product, so that users' personal sensitive data is well protected.

The terminology used in the following embodiments is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in the specification and the appended claims, the singular forms "a," "an," and "the" are intended to also include expressions such as "one or more," unless the context clearly indicates otherwise. It should also be understood that in the embodiments below, "at least one" and "one or more" mean one, two, or more than two. The term "and/or" describes an association relationship between associated objects and means that three relationships may exist; for example, A and/or B may represent: A alone, both A and B, and B alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects.

Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "one embodiment," "some embodiments," "another embodiment," "other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more, but not all, embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.

Fig. 3 is a schematic functional block diagram of an apparatus 300 for side channel detection according to an embodiment of the present application. As shown in fig. 3, the apparatus 300 includes a native compiler 310 and a compilation plug-in 320, wherein the compilation plug-in 320 includes a registration module 321 and an encapsulation module 322, the encapsulation module 322 further including a constraint detection module 3221, a desensitization module 3222, and a type inference module 3223, wherein:

The native compiler 310 is a compiler that supports secondary development (i.e., supports extension), and may be, for example, a C/C++ native compiler (e.g., a GNU Compiler Collection (GCC) compiler, a Clang compiler, etc.).

The compilation plug-in 320 is used to extend the functions of the native compiler 310.

A registration module 321, configured to mark sensitive data in the source code.

Optionally, the registration module 321 is configured to register the attribute of the side channel and mark the attribute of the side channel on the sensitive data.

By marking the sensitive data, the registration module 321 supports the subsequent constraint detection and desensitization mechanism for the marked sensitive data. The marked sensitive data also serves as the source point of type inference, which can be regarded as the process of propagating the influence of the sensitive data.

An encapsulation module 322 for implementing the overall encapsulation of the constraint detection module 3221, the desensitization module 3222, and the type inference module 3223.

Optionally, the encapsulation module 322 may also be referred to as a type checking pass (TypeCheckingPass) module.

Constraint detection module 3221, configured to determine whether the marked sensitive data meets a side channel security constraint. When the marked sensitive data does not meet the constraint, the module triggers a side channel problem alarm and records error information in the compilation log; when the marked sensitive data meets the constraint, the subsequent steps continue.

Optionally, the error information includes the error location, details of the violated constraint, and the like.

Optionally, when the constraint detection module 3221 determines that the marked sensitive data does not meet a side channel security constraint, the subsequent steps continue after the error information is recorded in the compilation log. For example, when the constraint detection module 3221 determines that the marked sensitive data does not meet the branch execution time side channel security constraint, the following steps continue after the first error information is recorded in the compilation log: determining whether the marked sensitive data meets the variable time function side channel security constraint, determining whether it meets the cache time side channel security constraint, determining whether it meets the variable time instruction side channel security constraint, and the like.

A desensitization module 3222, configured to desensitize the marked sensitive data when the marked sensitive data satisfies a desensitization condition; in this case, whether the marked sensitive data satisfies the security constraint is not checked.

Optionally, the desensitization condition includes the marked sensitive data satisfying an equivalent operation condition and/or the marked sensitive data being in an obfuscated state.

A type inference module 3223, configured to propagate the influence of the marked sensitive data.

In one example, the marked sensitive data is assigned to another variable, and that variable then also carries the attribute mark. That is, the influence of the marked sensitive data is propagated to the other variable, which also becomes an object to be checked when the security constraint judgment is performed.
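The propagation described in this example can be sketched on a toy model. The following is a minimal illustration only, not the plug-in's implementation: the `assign` structure, the integer variable ids, and the function name `propagate_marks` are all hypothetical, and the analysis is deliberately flow-insensitive (it ignores statement order and simply repeats until no mark changes).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR: one assignment "dst = src" between variable ids. */
struct assign {
    int dst;
    int src;
};

/* Propagate the sensitive mark along assignments until nothing changes.
 * sensitive[i] is true when variable i carries the attribute mark. */
void propagate_marks(bool *sensitive, const struct assign *stmts, size_t n)
{
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            if (sensitive[stmts[i].src] && !sensitive[stmts[i].dst]) {
                sensitive[stmts[i].dst] = true; /* dst inherits the mark */
                changed = true;
            }
        }
    }
}
```

After propagation, every variable that received the mark would be treated like the originally marked data when the security constraints are checked.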

Alternatively, the apparatus 300 may not include the native compiler 310, that is, the apparatus may exist independently of the native compiler.

It should be understood that: the desensitization module 3222 and the type inference module 3223 described above are optional modules of the apparatus 300.

Optionally, the apparatus 300 may further comprise an acquisition module for acquiring attribute information of the marked sensitive data.

Optionally, the apparatus 300 may further include a recording module for recording error information in the compiled log when the side channel risk is detected.

In the embodiment of the application, the compilation plug-in extends the functions of the native compiler so that the attributes of sensitive data in the source code can be marked. On this basis, multiple side channels (such as the branch execution time side channel, the variable time function side channel, the cache time side channel, and the variable time instruction side channel) can be detected accurately and comprehensively during compilation, so that the risks of the various side channels are identified quickly and accurately and the problem of false negatives is largely solved. In addition, by checking the desensitization mechanism, side channel risk detection is skipped when the sensitive data satisfies the desensitization condition, which reduces overhead and largely solves the problem of false positives. Furthermore, the device propagates the influence of sensitive data by means of type inference, which can significantly improve the usability of the device and save a large amount of work.

Illustratively, fig. 4 shows a functional block diagram of an apparatus 400 for side channel detection according to an embodiment of the present application. As shown in fig. 4, the apparatus 400 includes a native compiler 410 and a compilation plug-in 420, wherein the compilation plug-in 420 includes a registration module 421 and an encapsulation module 422, the encapsulation module 422 further includes a constraint detection module 4221, a desensitization module 4222, and a type inference module 4223, the constraint detection module 4221 further includes a branch execution time side channel module 42211, a variable time function detection module 42212, a cache time side channel detection module 42213, and a variable time instruction side channel detection module 42214, wherein:

The native compiler 410 is a compiler that supports secondary development (i.e., supports extension), and may be, for example, a C/C++ native compiler (e.g., a GNU Compiler Collection (GCC) compiler, a Clang compiler, etc.).

The compilation plug-in 420 is used to extend the functions of the native compiler 410.

A registration module 421 for marking sensitive data in the source code.

Optionally, the registration module 421 is configured to register the attribute of the side channel and mark the attribute of the side channel on the sensitive data.

By marking the sensitive data, the registration module 421 supports the subsequent constraint detection and desensitization mechanism for the marked sensitive data. The marked sensitive data also serves as the source point of type inference, which can be regarded as the process of propagating the influence of the sensitive data.

The encapsulation module 422 is configured to implement the overall encapsulation of the constraint detection module 4221, the desensitization module 4222, and the type inference module 4223.

Optionally, the encapsulation module 422 may also be referred to as a type checking pass (TypeCheckingPass) module.

A constraint detection module 4221, configured to determine whether the marked sensitive data meets a side channel security constraint condition, and when the marked sensitive data does not meet the side channel security constraint condition, trigger a side channel problem alarm, and record error information in a compiled log; when the marked sensitive data meets the side channel security constraints, then other subsequent steps continue to be performed.

A branch execution time side channel module 42211, configured to detect whether a side channel of the branch execution time meets a side channel security constraint condition, trigger a side channel problem alarm if the marked sensitive data does not meet the side channel security constraint condition, and record first error information in a compiled log; when the marked sensitive data meets the side channel security constraints, other subsequent steps (e.g., steps performed by variable time function detection module 42212, steps performed by cache time side channel detection module 42213, steps performed by variable time instruction side channel detection module 42214, etc.) are continued.

Optionally, the branch execution time side channel module 42211 detects whether the program meets the side channel security constraint (i.e. whether there is a branch execution time side channel risk) by detecting the impact of the marked sensitive data on the program control flow, which may be as follows:

The branch execution time side channel module 42211 obtains the conditional statement information stored in the compilation intermediate representation. When marked sensitive data is present in the conditional statement information (e.g., in a judgment condition), the module determines that the side channel security constraint is not met, triggers a side channel problem alarm, and records the first error information in the compilation log.

The first error information may include error location information, namely the code line in which the marked sensitive data appears in the conditional statement information, and may further include details of the violated constraint, namely that the branch execution time side channel presents a security risk.

It can be understood that when marked sensitive data is present in the conditional statement information (e.g., in a judgment condition), the execution time of the program may vary with that data; therefore, in this case, it is determined that the side channel security constraint is not satisfied.
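To make the flagged pattern concrete, the sketch below contrasts a function whose condition reads a secret bit with a branch-free variant. Both function names are hypothetical and the example assumes `secret_bit` is 0 or 1. Whether the second form actually compiles to branch-free machine code depends on the compiler and target, so it is an illustration of the source-level pattern rather than a guarantee.

```c
#include <stdint.h>

/* Secret-dependent branch: the condition reads the marked value, which is
 * the pattern the branch execution time check is described as reporting. */
uint32_t select_branchy(uint32_t secret_bit, uint32_t a, uint32_t b)
{
    if (secret_bit) {
        return a;
    }
    return b;
}

/* Branch-free alternative: derive an all-ones or all-zeros mask from the
 * bit and combine both inputs, so no condition depends on the secret. */
uint32_t select_masked(uint32_t secret_bit, uint32_t a, uint32_t b)
{
    uint32_t mask = (uint32_t)0 - (secret_bit & 1u); /* 0xFFFFFFFF or 0 */
    return (a & mask) | (b & ~mask);
}
```

For inputs where `secret_bit` is 0 or 1 the two functions return the same value; only the first places the secret in a judgment condition.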

The variable time function detection module 42212 is configured to detect whether the variable time function side channel meets the side channel security constraint condition, trigger a side channel problem alarm if the marked sensitive data does not meet the side channel security constraint condition, and record a second error message in the compiled log; when the marked sensitive data meets the side channel security constraints, then other subsequent steps (e.g., steps performed by the cache time side channel detection module 42213, steps performed by the variable time instruction side channel detection module 42214, etc.) continue to be performed.

Optionally, the variable time function detection module 42212 detects whether the program meets the side channel security constraint (i.e., whether there is a variable time function side channel risk) by detecting whether the marked sensitive data is a call parameter of a variable time function, which may be as follows:

The variable time function detection module 42212 obtains function call information during compilation. The function call information includes the name of the called function and its parameter list. The module determines from the name whether the called function is a variable time function; if so, it further determines whether marked sensitive data is present in the parameter list. If so, the module determines that the side channel security constraint is not met, triggers a side channel problem alarm, and records the second error information in the compilation log.

The second error information may include error location information, namely the code line in which the marked sensitive data appears in the parameter list of the called function, and may further include details of the violated constraint, namely that the variable time function side channel presents a security risk.
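The name lookup followed by the parameter check can be sketched as below. This is a toy model under stated assumptions: the `call_info` structure and both function names are hypothetical, and the list of variable time functions is an assumption for illustration (the source does not enumerate one; `memcmp` and `strcmp` are used here because comparison routines that may stop at the first difference are commonly treated as variable time).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed list of variable time functions; not taken from the source. */
static const char *const variable_time_functions[] = { "memcmp", "strcmp" };

/* Hypothetical record of one function call seen during compilation. */
struct call_info {
    const char *callee;        /* name of the called function */
    const bool *arg_sensitive; /* arg_sensitive[i]: is argument i marked? */
    size_t argc;               /* number of arguments */
    int line;                  /* source line, for the error location */
};

bool is_variable_time_function(const char *name)
{
    size_t n = sizeof variable_time_functions / sizeof variable_time_functions[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, variable_time_functions[i]) == 0) {
            return true;
        }
    }
    return false;
}

/* True when the callee is on the list and at least one argument is marked. */
bool call_violates_constraint(const struct call_info *call)
{
    if (!is_variable_time_function(call->callee)) {
        return false;
    }
    for (size_t i = 0; i < call->argc; i++) {
        if (call->arg_sensitive[i]) {
            return true;
        }
    }
    return false;
}
```

A call is reported only when both conditions hold, which mirrors the two-stage decision in the description: a listed function with unmarked arguments, or a marked argument passed to an unlisted function, passes this particular check.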

The cache time side channel detection module 42213 is configured to detect whether the cache time side channel meets the side channel security constraint condition, trigger a side channel problem alarm if the marked sensitive data does not meet the side channel security constraint condition, and record third error information in the compilation log; when the marked sensitive data meets the side channel security constraints, other subsequent steps (e.g., steps performed by the variable time instruction side channel detection module 42214, etc.) continue to be performed.

The cache time side channel detection module 42213 detects whether the program meets the side channel security constraint (i.e., whether there is a cache time side channel risk) by detecting the impact of the marked sensitive data on data access, which may be as follows:

The cache time side channel detection module 42213 obtains the dereference statement information stored in the compilation intermediate representation, which includes the expression that computes a pointer's address offset. The module then determines whether the marked sensitive data is used as the pointer offset for accessing data in that expression. If so, it determines that the side channel security constraint is not met, triggers a side channel problem alarm, and records the third error information in the compilation log.

The third error information may include error location information, namely the code line in which the marked sensitive data is used as the pointer offset for accessing data in the address offset expression, and may further include details of the violated constraint, namely that the cache time side channel presents a security risk.
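As an illustration of the access pattern in question, the sketch below shows a table lookup whose offset is the secret, next to a variant that reads every entry and keeps the wanted one with a mask. The function names are hypothetical, and the second form is only one possible way to remove the secret from the address computation; its timing behavior still depends on the compiler and hardware.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Secret-indexed load: the marked value is the pointer offset, which is the
 * pattern the cache time check is described as reporting. */
uint8_t lookup_indexed(const uint8_t *table, size_t secret_index)
{
    return table[secret_index];
}

/* Alternative that touches every entry, so the addresses accessed do not
 * depend on the secret; the wanted entry is selected with a mask. */
uint8_t lookup_scan(const uint8_t *table, size_t len, size_t secret_index)
{
    uint8_t result = 0;
    for (size_t i = 0; i < len; i++) {
        size_t diff = i ^ secret_index;
        /* is_zero is 1 when diff == 0 and 0 otherwise, without a branch */
        size_t is_zero = (size_t)1 ^
            ((diff | ((size_t)0 - diff)) >> (sizeof(size_t) * CHAR_BIT - 1));
        uint8_t mask = (uint8_t)((size_t)0 - is_zero); /* 0xFF or 0x00 */
        result |= (uint8_t)(table[i] & mask);
    }
    return result;
}
```

In the first function the secret appears in the address offset expression of the dereference; in the second it appears only in arithmetic on values already loaded.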

A variable time instruction side channel detection module 42214, configured to detect whether the variable time instruction side channel meets the side channel security constraint condition, trigger a side channel problem alarm if the marked sensitive data does not meet the side channel security constraint condition, and record fourth error information in the compilation log; when the marked sensitive data meets the side channel security constraints, other subsequent steps (e.g., steps performed by the type inference module 4223, etc.) continue to be performed.

Optionally, the variable time instruction side channel detection module 42214 detects whether the program meets the side channel security constraint (i.e., whether there is a variable time instruction side channel risk) by detecting whether the marked sensitive data is an operand of a variable time instruction, which may be as follows:

The variable time instruction side channel detection module 42214 determines the CPU architecture targeted when the program is compiled, determines the variable time instructions corresponding to that CPU architecture, and determines the operands of the operations corresponding to those variable time instructions. It then determines whether the marked sensitive data appears among those operands. If so, it determines that the side channel security constraint is not met, triggers a side channel problem alarm, and records the fourth error information in the compilation log.

The fourth error information may include error location information, namely the code line in which the marked sensitive data appears as an operand of the operation corresponding to the variable time instruction, and may further include details of the violated constraint, namely that the variable time instruction side channel presents a security risk.

It can be understood that the CPU architecture at program compilation corresponds to a set of variable time instructions; that is, different CPU architectures have different variable time instructions.
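The architecture-dependent lookup can be sketched with a small table. Everything in the table is a placeholder: the architecture names `arch_a` and `arch_b` and the operations listed for them are hypothetical and make no statement about real hardware; the structure and function names are likewise assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: operation `op` is variable time on architecture `arch`. */
struct arch_rule {
    const char *arch;
    const char *op;
};

/* Placeholder entries only; not a statement about any real CPU. */
static const struct arch_rule arch_rules[] = {
    { "arch_a", "div" },
    { "arch_a", "mod" },
    { "arch_b", "div" },
    { "arch_b", "mul" },
};

bool is_variable_time_op(const char *arch, const char *op)
{
    size_t n = sizeof arch_rules / sizeof arch_rules[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(arch_rules[i].arch, arch) == 0 &&
            strcmp(arch_rules[i].op, op) == 0) {
            return true;
        }
    }
    return false;
}

/* True when the operation is variable time on the target architecture and
 * at least one of its two operands carries the sensitive mark. */
bool op_violates_constraint(const char *arch, const char *op,
                            bool lhs_sensitive, bool rhs_sensitive)
{
    return is_variable_time_op(arch, op) && (lhs_sensitive || rhs_sensitive);
}
```

The same operation can therefore be reported for one target and accepted for another, which is the correspondence between architecture and variable time instructions that the text describes.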

A desensitization module 4222 for desensitizing the marked sensitive data when the marked sensitive data meets a desensitization condition, and in this case not detecting whether the marked sensitive data meets a condition of a security constraint.

A type inference module 4223, configured to propagate the influence of the marked sensitive data.

Alternatively, the apparatus 400 may not include the native compiler 410, that is, the apparatus 400 may exist independently of the native compiler 410.

In one example, the native compiler 410 is a GCC compiler and the step of the constraint detection module 4221 determining whether side channel security constraints are met may be:

In the intermediate stage of compilation, it is first determined whether marked sensitive data exists in the conditional expression contained in a conditional statement (gimple_cond) in the GIMPLE intermediate representation; next, it is determined whether the function called in a call statement (gimple_call) is a variable time function and whether a call parameter is sensitive data; next, it is determined whether sensitive data exists in the address offset expression of an indirect_ref dereference; finally, it is determined whether an operation involves a variable time instruction with the sensitive data used as an operand. Note that the detection steps need not be performed in this order.

Fig. 5 is a schematic diagram illustrating an implementation form of a method for detecting a side channel in software according to an embodiment of the present application. As shown in fig. 5, this implementation form includes the following steps:

S501: The source code is input to a native compiler.

S502: Add the compilation plug-in to the native compiler and deploy it, as an extension plug-in of the native compiler, in the compilation flow of the software.

The function of the compiling plug-in is described in detail in the embodiments shown in fig. 3 and 4, and is not described here again for brevity.

S503: Configure use of the plug-in in the build script or build configuration.

Therefore, the compilation plug-in provided by the embodiment of the application participates deeply in the compilation of the software and can judge whether the implemented program semantics carry side channel risks, yet it does not affect the original implementation of the program at all; its sole purpose is to detect side channel problems. The embodiment shows how the compilation plug-in is deployed in software compilation, and the compilation plug-in is completely independent of the native compiler.

Illustratively, fig. 6 shows a schematic flow chart of a method 600 for side channel detection provided in an embodiment of the present application. As shown in fig. 6, the method 600 includes:

S601: the sensitive data is marked.

In particular, the marking of sensitive data is performed in the source code.

In one example, marking sensitive data in source code may specifically be:

where a is sensitive data.

S602: in the compiling process, side channel detection is performed based on the marked sensitive data.

Optionally, the marked sensitive data is identified (or obtained) during the compilation process.

Optionally, security constraint detection is performed based on the marked sensitive data (for example, in a pass (PASS) that processes and checks the intermediate representation (IR) of the code at the compile stage); that is, branch execution time side channel detection, variable time function side channel detection, cache time side channel detection, and variable time instruction side channel detection are performed respectively.

Optionally, when the marked sensitive data satisfies the desensitization condition, the marked sensitive data is temporarily desensitized, and the security constraint detection described above is not performed.

Optionally, type inference is performed based on the marked sensitive data, so that the influence of the marked sensitive data is propagated.

S603: Output the compilation result.

When the compilation result is success, it indicates that no side channel risk exists.

When the compilation result is failure, it indicates that a side channel risk exists.

Optionally, when a compilation error (side channel risk) is detected during compilation, it is recorded automatically. The recorded compilation error information can be output together after the whole program has been compiled, the source code can then be modified in one pass according to that information, and the modified code is compiled and checked again.

The embodiment of the application provides a method for statically detecting time side channel problems. The method uses the ability of a compilation plug-in to attach attributes to variables: the attributes of sensitive data are marked, and in the intermediate representation (IR) stage of compilation, time side channel problems are detected by comparing types between nodes or by judging where the marked sensitive data appears in the program. At the same time, a desensitization mechanism and the type inference of a pluggable type system are implemented, which effectively improves the accuracy and completeness of side channel detection.

Illustratively, fig. 7 shows a schematic flow chart of a method 700 for side channel detection provided by an embodiment of the present application. As shown in fig. 7, the method 700 includes:

S701: Attribute information of the first sensitive data is acquired.

Optionally, attribute registration is performed on the first sensitive data in the source code, attribute marking is performed on the first sensitive data, the first sensitive data with the attribute marking is obtained, and attribute information of the first sensitive data is obtained in a compiling stage.

The first sensitive data may refer to all sensitive data in the source code, to part of the sensitive data in the source code, or to a particular type of sensitive data in the source code.

Optionally, the attributes of the sensitive data are registered in the source code by means of a C language macro definition. The registered attribute can then be placed between the variable type and the variable name in the source code, as an additional type or attribute. The compilation plug-in then acquires the attributes of these variables at an intermediate stage of the compilation process for the subsequent security constraint judgment.

Optionally, after the attribute of the sensitive data is configured through an additional configuration file, the attribute information of the first sensitive data is read at the time of compiling.
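The macro-based marking described for this step could look like the sketch below. The macro name `SENSITIVE`, the attribute name `sensitive`, and the guard macro `SIDE_CHANNEL_PLUGIN` are all hypothetical; the source does not give the actual names. The guard is an assumption added so that the code still builds when no plug-in registers the attribute, in which case the macro expands to nothing.

```c
/* Hypothetical names: SIDE_CHANNEL_PLUGIN is assumed to be defined only when
 * a plug-in that registers the "sensitive" attribute is loaded. Without it,
 * SENSITIVE expands to nothing and the program compiles unchanged. */
#ifdef SIDE_CHANNEL_PLUGIN
#define SENSITIVE __attribute__((sensitive))
#else
#define SENSITIVE
#endif

/* The mark sits between the variable type and the variable name. */
int checksum(const unsigned char *buf, int len)
{
    int SENSITIVE a = 0; /* a is marked as sensitive data */
    for (int i = 0; i < len; i++) {
        a += buf[i];
    }
    return a;
}
```

Because the mark is carried by a macro, the source can be annotated once and compiled both with and without the detection enabled.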

S702: Determine whether the first sensitive data meets a desensitization condition.

The desensitization condition of the first sensitive data includes an equivalence condition and an obfuscation condition. The equivalence condition means that all branches of the program have the same number of instructions and perform operations of the same type; the obfuscation condition means that the first sensitive data has been obfuscated.

When the first sensitive data satisfies the desensitization condition, step S703 is performed; when the first sensitive data does not satisfy the desensitization condition, step S704 is performed.
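One possible reading of the desensitization condition is sketched below on a toy model. The `op_kind` enumeration, the `branch` structure, and the function names are hypothetical. Two interpretive assumptions are made and should be treated as such: "operations of the same type" is taken to mean that the two branches match position by position, and satisfying either the equivalence condition or the obfuscation condition is taken to be sufficient.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation kinds for a toy IR. */
enum op_kind { OP_ADD, OP_MUL, OP_LOAD, OP_STORE };

/* Hypothetical description of one branch as a sequence of operations. */
struct branch {
    const enum op_kind *ops;
    size_t count;
};

/* Equivalence condition (assumed reading): both branches have the same
 * number of instructions and the same kind of operation at each position. */
bool branches_equivalent(const struct branch *a, const struct branch *b)
{
    if (a->count != b->count) {
        return false;
    }
    for (size_t i = 0; i < a->count; i++) {
        if (a->ops[i] != b->ops[i]) {
            return false;
        }
    }
    return true;
}

/* Desensitization condition (assumed reading): either the branches are
 * equivalent or the sensitive data has been obfuscated. */
bool meets_desensitization(const struct branch *a, const struct branch *b,
                           bool obfuscated)
{
    return obfuscated || branches_equivalent(a, b);
}
```

When this predicate holds, the flow would take the S703 path and skip the security constraint checks for that data.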

S703: the first sensitive data is temporarily desensitized.

Alternatively, in this case, step S706 may be further performed.

S704: it is determined whether the first sensitive data satisfies a side channel security constraint.

Specifically, security constraint detection is performed based on the first sensitive data to determine whether the first sensitive data meets the side channel security constraint; that is, branch execution time side channel detection, variable time function side channel detection, cache time side channel detection, and variable time instruction side channel detection are performed respectively, and the four detection processes may be performed in any order. If the side channel security constraint is not satisfied, step S705 is performed; if the side channel security constraint is satisfied, step S706 is performed.

S705: error information is recorded in the compiled log.

The error information may include information such as error location, constraint violation details, and the like.
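A record holding the error location and the violated constraint could be modeled as below. The structure, the enumeration, and the message layout are hypothetical; the source does not specify the format of entries in the compilation log.

```c
#include <stddef.h>
#include <stdio.h>

/* The four side channel kinds named in the description. */
enum channel_kind {
    BRANCH_TIME,
    VARIABLE_TIME_FUNCTION,
    CACHE_TIME,
    VARIABLE_TIME_INSTRUCTION
};

/* Hypothetical log record: error location plus the violated constraint. */
struct error_record {
    const char *file;
    int line;
    enum channel_kind kind;
};

const char *channel_name(enum channel_kind kind)
{
    switch (kind) {
    case BRANCH_TIME:               return "branch execution time";
    case VARIABLE_TIME_FUNCTION:    return "variable time function";
    case CACHE_TIME:                return "cache time";
    case VARIABLE_TIME_INSTRUCTION: return "variable time instruction";
    default:                        return "unknown";
    }
}

/* Format one record into buf and return the value reported by snprintf. */
int format_error(char *buf, size_t size, const struct error_record *rec)
{
    return snprintf(buf, size,
                    "%s:%d: error: %s side channel presents a security risk",
                    rec->file, rec->line, channel_name(rec->kind));
}
```

Keeping location and constraint kind as separate fields would let the records be collected during compilation and printed together at the end.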

S706: Perform type inference based on the first sensitive data.

This step propagates the influence of the first sensitive data. For example, when the first sensitive data is assigned to second data, the second data is also marked with the attribute of the first sensitive data; that is, the second data is also regarded as sensitive data.

Optionally, after the execution of this step, the process returns to step S702, and the compilation detection is continued, and step S707 is executed synchronously or asynchronously.

S707: Output the compilation result.

In the embodiment of the application, the compilation plug-in extends the functions of the native compiler so that the attributes of sensitive data in the source code can be marked. On this basis, multiple side channels (such as the branch execution time side channel, the variable time function side channel, the cache time side channel, and the variable time instruction side channel) can be detected accurately and comprehensively during compilation, so that the risks of the various side channels are identified quickly and accurately and the problem of false negatives is largely solved. In addition, by checking the desensitization mechanism, side channel risk detection is skipped when the sensitive data satisfies the desensitization condition, which reduces overhead and largely solves the problem of false positives. Furthermore, the method propagates the influence of sensitive data by means of type inference, which can significantly improve the usability of the method and save a large amount of work.

To more clearly understand the attribute acquisition process of sensitive data, fig. 8 is an exemplary flowchart illustrating a method 800 for acquiring attributes of sensitive data according to an embodiment of the present application. As shown in fig. 8, the method 800 includes:

S801: the attributes of the first sensitive data are registered in the source code.

Optionally, the attribute of the first sensitive data is registered in the source code by means of a C-language macro definition.

S802: Mark the first sensitive data in the source code with the attribute to obtain first sensitive data carrying the attribute mark.

The first sensitive data may refer to all sensitive data in the source code or to part of the sensitive data in the source code. The attribute of the first sensitive data can be placed between the variable type and the variable name in the source code, as an additional type or attribute.

S803: attribute information of the first sensitive data is acquired.

Specifically, in the middle stage of the compiling process, the compiling plug-in acquires attribute information of the first sensitive data for subsequent security constraint judgment.

In the embodiment of the application, attribute marking is performed on the sensitive data, and in the middle stage of the compiling process, the compiling plug-in obtains attribute information of the sensitive data, so that a basis can be provided for subsequent detection of side channel risks.

To more clearly understand the security constraint detection process based on sensitive data, fig. 9 is an exemplary flowchart illustrating a method 900 for security constraint detection based on sensitive data according to an embodiment of the present application. As shown in fig. 9, the method 900 includes:

S901: it is determined whether there is a branch execution time side channel security risk. If yes, go to step S902; if not, step S903 is performed.

S902: The first error information is recorded in the compilation log, and step S903 is then performed.

Optionally, step S903 is performed after the erroneous code is corrected according to the first error information.

S903: It is determined whether the first sensitive data is a call parameter of a variable time function (i.e., whether there is a variable time function side channel security risk). If yes, go to step S904; if not, step S905 is performed.

S904: The second error information is recorded in the compilation log, and step S905 is then performed.

Optionally, step S905 is performed after the erroneous code is corrected according to the second error information.

S905: It is determined whether there is a cache time side channel security risk. If yes, go to step S906; if not, step S907 is performed.

S906: The third error information is recorded in the compilation log, and step S907 is then performed.

Optionally, step S907 is performed after the erroneous code is corrected according to the third error information.

S907: it is determined whether the first sensitive data is an operand for a variable time instruction (i.e., it is determined whether there is a variable time instruction side channel security risk). If yes, go to step S908; if not, the compiling is finished, and the compiling result is that the compiling is successful.

S908: fourth error information is recorded in the compilation log.

Optionally, the error code is corrected according to the fourth error information.
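The sequential flow of method 900 can be illustrated with a minimal sketch. The dictionary keys, the message strings, and the function name `run_method_900` are hypothetical and chosen only for illustration; an actual compiling plug-in would evaluate each condition on the intermediate representation rather than on precomputed flags.

```python
from typing import Callable, Dict, List, Tuple

# Each entry pairs an error message with a predicate over a toy program summary.
# The flag names are assumptions made for this sketch.
Check = Tuple[str, Callable[[Dict[str, bool]], bool]]

CHECKS: List[Check] = [
    ("first error: branch execution time side channel",
     lambda p: p.get("secret_in_condition", False)),      # S901
    ("second error: variable time function side channel",
     lambda p: p.get("secret_in_vt_call", False)),        # S903
    ("third error: cache time side channel",
     lambda p: p.get("secret_as_offset", False)),         # S905
    ("fourth error: variable time instruction side channel",
     lambda p: p.get("secret_in_vt_operand", False)),     # S907
]


def run_method_900(program: Dict[str, bool]) -> List[str]:
    """Run the four checks in order and collect every violation.

    A failed check only appends to the log; the next check still runs,
    so a single compilation reports all violated constraints.
    """
    compile_log: List[str] = []
    for message, has_risk in CHECKS:
        if has_risk(program):
            compile_log.append(message)
    return compile_log
```

An empty log corresponds to the case in which the compilation ends successfully.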

Hereinafter, the process of security constraint detection will be described in detail with reference to fig. 10 to 13, exemplarily, on the basis of the embodiment shown in fig. 9.

Illustratively, fig. 10 shows a schematic flow chart of a method 1000 for determining whether there is a branch execution time side channel security risk (which may also be understood as detecting the influence of sensitive data on a program control flow) provided by an embodiment of the present application. As shown in fig. 10, the method 1000 includes:

S1001: Acquire the conditional statement information.

Specifically, conditional statement information stored in the compiled intermediate representation is acquired.

Optionally, the conditional statement information is acquired from loop exit conditions and/or if-statement conditions in the source code.

S1002: Determine whether the first sensitive data is present in the acquired conditional statement information. If so, perform step S1003; if not, perform step S1004.

Specifically, whether the first sensitive data exists in the acquired conditional statement information is determined according to the attribute information of the first sensitive data.

Optionally, the conditional statement information may include a left variable (lhs) and a right variable (rhs). For example, in the conditional statement information 2>1, 2 is the left variable and 1 is the right variable.

S1003: Record the first error information in the compilation log, and then perform step S1004.

The first error information may include error location information (the code line on which the first sensitive data appears in the conditional statement information), and may further include details of the violated constraint (the branch execution time side channel presents a security risk).

Optionally, the code is repaired according to the first error information, and step S1004 is then performed.

It can be understood that if the first sensitive data is present in the conditional statement information, the execution time of the program may vary. Therefore, when the first sensitive data is detected in the conditional statement information, the first error information needs to be recorded in the compilation log.

S1004: Continue the compilation detection.
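Steps S1001 to S1004 can be sketched as follows, under the assumption that each conditional statement has already been reduced to a left variable, a right variable, and a source line. The `Condition` class and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Condition:
    """Toy stand-in for conditional statement information in the IR."""
    line: int
    lhs: str
    rhs: str


def check_branch_conditions(conditions: List[Condition],
                            secrets: Set[str]) -> List[str]:
    """Report every condition whose lhs or rhs is marked as sensitive."""
    log: List[str] = []
    for cond in conditions:                      # S1001: acquired conditions
        for operand in (cond.lhs, cond.rhs):     # S1002: look for marked data
            if operand in secrets:
                # S1003: error location plus violated constraint details
                log.append(
                    f"line {cond.line}: '{operand}' appears in a condition; "
                    "branch execution time side channel risk"
                )
    return log                                   # S1004: detection continues
```

For instance, a condition `key > 0` with `key` marked as sensitive yields one log entry, while a loop bound `i < n` on unmarked variables yields none.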

Illustratively, fig. 11 shows a schematic flow chart of a method 1100 for determining whether there is a variable time function side channel security risk provided by an embodiment of the present application. As shown in fig. 11, the method 1100 includes:

S1101: Acquire function call information.

Wherein the function call information includes a name of the call function and a parameter list of the call function.

Specifically, the function call information is acquired during the compiling process.

S1102: Determine, according to the name of the calling function, whether the calling function is a variable time function. If so, perform step S1103.

S1103: Determine whether the first sensitive data is present in the parameter list of the calling function. If so, perform step S1104.

Optionally, if the first sensitive data is passed in as a pointer parameter, or the length of the first sensitive data is passed in as the length of the address space pointed to, it may be determined that the first sensitive data is present in the parameter list of the calling function.

S1104: Record the second error information in the compilation log, and then perform step S1105.

The second error information may include error location information (the code line on which the first sensitive data appears in the parameter list of the calling function), and may further include details of the violated constraint (the variable time function side channel presents a security risk).

Optionally, the code is repaired according to the second error information, and step S1105 is then performed.

S1105: Continue the compilation detection.
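A minimal sketch of steps S1101 to S1105 follows. The set of variable time function names is an assumption made for illustration (comparison routines such as `memcmp` and `strcmp` commonly return early on the first difference); the `Call` class and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set

# Assumed list of variable time functions; a real tool would make this configurable.
VARIABLE_TIME_FUNCTIONS: Set[str] = {"memcmp", "strcmp"}


@dataclass
class Call:
    """Toy stand-in for function call information: name plus parameter list."""
    line: int
    name: str
    args: List[str]


def check_calls(calls: List[Call], secrets: Set[str]) -> List[str]:
    """Report sensitive data passed to a variable time function."""
    log: List[str] = []
    for call in calls:                                   # S1101
        if call.name not in VARIABLE_TIME_FUNCTIONS:     # S1102
            continue
        hits = [a for a in call.args if a in secrets]    # S1103
        if hits:                                         # S1104
            log.append(
                f"line {call.line}: {', '.join(hits)} passed to {call.name}; "
                "variable time function side channel risk"
            )
    return log                                           # S1105
```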

Illustratively, fig. 12 shows a schematic flow chart of a method 1200 for determining whether there is a cache time side channel security risk (which may also be understood as detecting the impact of sensitive data on data access) provided by an embodiment of the present application. As shown in fig. 12, the method 1200 includes:

S1201: dereferencing statement information is acquired, the dereferencing statement information including a pointer calculation address offset expression.

Specifically, dereferencing statement information stored in the compiled intermediate representation is obtained.

S1202: Determine whether the first sensitive data is used as the pointer offset for data access in the pointer calculation address offset expression. If yes, perform step S1203; if not, perform step S1204.

Specifically, the pointer offset used for data access in the pointer calculation address offset expression is identified, and it is then determined whether the first sensitive data is used as that pointer offset.

S1203: Record the third error information in the compilation log, and then perform step S1204.

Optionally, after the erroneous code is corrected according to the third error information, step S1204 is performed.

The third error information may include error location information (the code line on which the first sensitive data is used as the pointer offset for data access in the pointer calculation address offset expression), and may further include details of the violated constraint (the cache time side channel presents a security risk).

S1204: Continue the compilation detection.
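Steps S1201 to S1204 can be sketched as follows, assuming each dereferencing statement has been reduced to a base pointer and the list of variables that appear in its offset expression. The `Dereference` class and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Dereference:
    """Toy stand-in for dereferencing statement information, e.g. table[key & 0xff]."""
    line: int
    base: str
    offset_vars: List[str]  # variables appearing in the address offset expression


def check_dereferences(derefs: List[Dereference],
                       secrets: Set[str]) -> List[str]:
    """Report memory accesses whose offset depends on sensitive data."""
    log: List[str] = []
    for deref in derefs:                                         # S1201
        hits = [v for v in deref.offset_vars if v in secrets]    # S1202
        if hits:                                                 # S1203
            log.append(
                f"line {deref.line}: {', '.join(hits)} used as offset into "
                f"{deref.base}; cache time side channel risk"
            )
    return log                                                   # S1204
```

A table lookup indexed by a marked key is reported, whereas an access indexed only by an unmarked loop counter is not.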

Illustratively, fig. 13 shows a schematic flow chart of a method 1300 for determining whether there is a variable time instruction side channel security risk provided by an embodiment of the present application. As shown in fig. 13, the method 1300 includes:

S1301: the CPU architecture at program compile time is determined.

S1302: Determine the variable time instructions corresponding to the CPU architecture.

The CPU architecture has a correspondence relationship with the variable time instruction, that is, different CPU architectures correspond to different variable time instructions.

S1303: Determine whether the first sensitive data is present in the operands of an operation corresponding to a variable time instruction. If yes, perform step S1304; if not, perform step S1305.

S1304: Record the fourth error information in the compilation log, and then perform step S1305.

Optionally, after the erroneous code is corrected according to the fourth error information, step S1305 is performed.

The fourth error information may include error location information (the code line on which the first sensitive data appears in the operands of the operation corresponding to the variable time instruction), and may further include details of the violated constraint (the variable time instruction side channel presents a security risk).

S1305: Continue the compilation detection.
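A minimal sketch of steps S1301 to S1305 is given below. The architecture names `arch_a` and `arch_b` and the operations listed for them are placeholders, not statements about any real CPU; the `Operation` class and the function name are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Placeholder correspondence between CPU architecture and variable time operations.
VARIABLE_TIME_OPS: Dict[str, Set[str]] = {
    "arch_a": {"div", "mod"},
    "arch_b": {"div", "mod", "mul"},
}


@dataclass
class Operation:
    """Toy stand-in for an IR operation with its operands."""
    line: int
    opcode: str
    operands: List[str]


def check_operations(arch: str, operations: List[Operation],
                     secrets: Set[str]) -> List[str]:
    """Report sensitive operands of operations that are variable time on `arch`."""
    variable_time = VARIABLE_TIME_OPS.get(arch, set())       # S1301, S1302
    log: List[str] = []
    for op in operations:
        if op.opcode not in variable_time:
            continue
        hits = [o for o in op.operands if o in secrets]      # S1303
        if hits:                                             # S1304
            log.append(
                f"line {op.line}: {', '.join(hits)} is an operand of "
                f"'{op.opcode}'; variable time instruction side channel risk"
            )
    return log                                               # S1305
```

Because the table is keyed by architecture, the same source code can pass on one target and be reported on another.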

In order to more clearly understand the process of determining whether the first sensitive data meets the desensitization condition provided in the embodiment of the present application, hereinafter, a method of determining whether the first sensitive data meets the desensitization condition provided in the embodiment of the present application will be described in detail by way of example with reference to fig. 14 and 15.

Illustratively, FIG. 14 shows a schematic flow chart of a method 1400 provided by an embodiment of the present application for determining whether first sensitive data satisfies a desensitization condition. As shown in fig. 14, the method 1400 includes:

S1401: Construct a control flow graph composed of the nodes, from start to end, of each branch of the program.

Specifically, the control flow graph is constructed with the conditional statement (or conditional statement information) stored in the intermediate representation of the function as the starting point and with the end point determined by a dominator tree algorithm, so that the graph consists of the nodes, from start to end, of each branch of the program.

Optionally, the control flow graph may be constructed as follows: the basic block partitioning is obtained from the intermediate representation (IR) of the function, and the control flow graph is then constructed according to the jump relationships between the basic blocks.

S1402: Determine whether the execution depths of the branches of the program are consistent. If so, perform step S1403; if not, perform step S1404.

In other words, this step determines whether the paths in the control flow graph are of equal length.

S1403: Determine whether each operation of each branch of the program is equivalent. If so, perform step S1405; if not, perform step S1404.

Optionally, whether each operation of each branch of the program is equivalent is determined by comparing, at the instruction level, the operations within the basic blocks on each path in the control flow graph.

S1404: Continue the security constraint detection.

Optionally, the security constraint detection may include the above-mentioned branch execution time side channel detection, and may further include one or more of variable time function side channel detection, cache time side channel detection, or variable time instruction side channel detection.

S1405: the first sensitive data in the conditional statement (or conditional statement information) stored in the intermediate representation of the function is temporarily desensitized.

It can be understood that: when the first sensitive data is in a temporary desensitization state, branch execution time side channel detection is not performed.
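The two conditions of steps S1402 and S1403 can be sketched as follows, assuming each branch has already been flattened into the sequence of operation codes along its path from the conditional statement to the end point. Treating identical operation codes as "equivalent" is a simplification adopted for this sketch, and the function name is hypothetical.

```python
from typing import List


def branches_equivalent(paths: List[List[str]]) -> bool:
    """Return True when all branch paths have equal depth and equal operations.

    Each path is the list of operation codes executed along one branch.
    """
    # S1402: the execution depths (path lengths) must all be the same.
    if len({len(path) for path in paths}) > 1:
        return False
    # S1403: at every step, all branches must execute the same operation.
    return all(len(set(step)) == 1 for step in zip(*paths))
```

When the function returns True, the sensitive data in the condition may be temporarily desensitized (S1405); otherwise, the security constraint detection continues (S1404).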

Illustratively, FIG. 15 shows a schematic flow chart of yet another method 1500 provided by an embodiment of the present application for determining whether first sensitive data satisfies a desensitization condition. As shown in fig. 15, the method 1500 includes:

S1501: Read the list of obfuscation operations defined in the program configuration file.

S1502: Determine whether the first sensitive data has been processed by an obfuscation operation in the list. If so, perform step S1503; if not, perform step S1504.

S1503: Desensitize the first sensitive data, and continue with the subsequent operations.

In particular, the attribute of the first sensitive data is converted into an intermediate special desensitization state in which the first sensitive data, like other unlabeled variables, does not present side channel risks for any operation.

Wherein the subsequent operation does not include the operation in step S1504.

This step S1503 can be understood as: when the first sensitive data is subjected to the desensitization operation, step S1504 (security constraint detection) is skipped, and the subsequent operation is directly performed.

S1504: Perform the security constraint detection, and continue with the subsequent operations after the security constraint detection is completed.

Optionally, the security constraint detection may include the variable time function side channel detection described above, and may further include one or more of branch execution time side channel detection, cache time side channel detection, or variable time instruction side channel detection.

Optionally, the method 1500 may further include the steps of:

s1505: a defrobbing operation is performed on the first sensitive data, and after the defrobbing operation is performed, the process returns to step S1504.

Illustratively, fig. 16 shows a schematic flow chart of a method 1600 for side channel detection provided by an embodiment of the present application, based on the embodiments shown in fig. 10-15. As shown in fig. 16, the method 1600 includes:

S1601: Determine whether the conditional judgment statement satisfies the equivalence condition. If so, perform step S1604; if not, perform step S1602.

Specifically, the conditional judgment statement satisfies the equivalence condition when, in its control flow graph, the branches are of equal depth and their operations are equivalent.

The explanation about this step is the same as that in the embodiment shown in fig. 14, and is not repeated here for brevity.

S1602: determining whether there is a branch execution time side channel security risk, if so, executing step S1603; if not, step S1604 is performed.

S1603: the first error information is recorded in the compiled log, and step S1604 is further performed.

The explanation about step S1602 and step S1603 is the same as that of step S901 and step S902 in the embodiment shown in fig. 9, and is not repeated here for brevity.

S1604: Determine whether the calling function is an obfuscation operation. If so, temporarily desensitize the first sensitive data (convert it to an intermediate special state) and then perform step S1607; if not, perform step S1605.

The explanation about this step is the same as that in the embodiment shown in fig. 15, and is not repeated here for brevity.

S1605: Determine whether the first sensitive data is a call parameter of a variable time function. If so, perform step S1606; if not, perform step S1607.

S1606: Record the second error information in the compilation log, and then perform step S1607.

The explanation about step S1605 and step S1606 is the same as that of step S903 and step S904 in the embodiment shown in fig. 9, and is not repeated here for brevity.

S1607: it is determined whether there is a cache time side channel security risk. If yes, go to step S1608; if not, step S1609 is executed.

S1608: Record the third error information in the compilation log, and then perform step S1609.

The explanation of step S1607 and step S1608 is the same as that of step S905 and step S906 in the embodiment shown in fig. 9, and is not repeated here for brevity.

S1609: it is determined whether the first sensitive data is an operand for a variable time instruction (i.e., it is determined whether there is a variable time instruction side channel security risk). If yes, go to step S1610; if not, the compiling is finished, and the compiling result is that the compiling is successful.

S1610: fourth error information is recorded in the compilation log.

Optionally, the error code is corrected according to the fourth error information and detection is continued.

The explanation of step S1609 and step S1610 is the same as that of step S907 and step S908 in the embodiment shown in fig. 9, and is not repeated here for brevity.
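The way method 1600 places the two desensitization conditions in front of the corresponding checks can be sketched as follows. The flag names and the function name `run_method_1600` are hypothetical, and each flag stands in for an analysis that would be performed on the intermediate representation.

```python
from typing import Dict, List


def run_method_1600(p: Dict[str, bool]) -> List[str]:
    """Follow the step order of method 1600 on a toy program summary."""
    log: List[str] = []
    # S1601: equivalent branches skip the branch execution time check.
    if not p.get("branches_equivalent", False):
        if p.get("secret_in_condition", False):       # S1602
            log.append("first error information")     # S1603
    # S1604: an obfuscation call skips the variable time function check.
    if not p.get("call_is_obfuscation", False):
        if p.get("secret_in_vt_call", False):         # S1605
            log.append("second error information")    # S1606
    if p.get("secret_as_offset", False):              # S1607
        log.append("third error information")         # S1608
    if p.get("secret_in_vt_operand", False):          # S1609
        log.append("fourth error information")        # S1610
    return log
```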

In order to more clearly understand the process of type reasoning based on the first sensitive data provided in the embodiment of the present application, fig. 17 is a schematic flowchart of a method 1700 for type reasoning based on the first sensitive data provided in the embodiment of the present application. As shown in fig. 17, the method 1700 includes:

It can be understood that the method 1700 is performed throughout the compilation process.

S1701: it is determined whether the type rule (Typing Rules) of the expression and the subtype inference rule (Subtyping Rules) of the variable are satisfied, and if so, step S1702 is performed.

Wherein optionally the expression comprises an assignment statement, an operation statement or a function return value, etc.

Specifically, this step tracks the data flow by means of the type rules of the expressions and the subtype inference rules of the variables, and automatically infers the additional types of the respective variables. That is, based on a small number of type tags (the first sensitive data) in the program, it is automatically determined whether the other variables in the program are sensitive, and they are tagged accordingly.

S1702: and carrying out attribute transfer.

Specifically, the tag attribute of the first sensitive data is transferred to other variables. For example, in the expression "b=a", where a is the first sensitive data, the tag attribute of a is transferred to b.
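The attribute transfer can be sketched as a fixed-point propagation over assignments. This is a simplified illustration, assuming each assignment has been reduced to a target variable and the list of variables read on its right-hand side; the function name is hypothetical.

```python
from typing import List, Set, Tuple


def propagate_sensitivity(assignments: List[Tuple[str, List[str]]],
                          secrets: Set[str]) -> Set[str]:
    """Return every variable that carries the sensitive tag after propagation.

    Each assignment is (target, sources); the tag flows from any sensitive
    source to the target, and the loop repeats until nothing changes.
    """
    sensitive = set(secrets)
    changed = True
    while changed:
        changed = False
        for target, sources in assignments:
            if target not in sensitive and any(s in sensitive for s in sources):
                sensitive.add(target)   # e.g. "b = a" transfers the tag of a to b
                changed = True
    return sensitive
```

With only `a` marked, the assignments `b = a` and `c = b + x` cause `b` and `c` to be tagged as well, while `d = x` leaves `d` untagged.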

For example, on top of the native type system of the C language (e.g., Bool, Integer, Char, …), a custom type system (Secret, uncompable, other Tags, …) is added to implement extension and inference of the types of sensitive data. In a specific example:
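Formulas (1) and (2) are not reproduced in this text. One reading consistent with the two explanations that follow, written as inference rules, is given here as an assumption rather than as the original notation; the typing context $\Gamma$, the subtype relation $<:$, and the binary operator symbol $\oplus$ are notational choices made for this sketch.

```latex
\frac{\Gamma \vdash e_1 : T_1 \qquad \Gamma \vdash e_2 : T_2 \qquad T_1 <: T_2}
     {\Gamma \vdash e_1 \oplus e_2 : T_2}
\tag{1}

\frac{\Gamma \vdash f(\ldots) : T \qquad x = f(\ldots)}
     {\Gamma \vdash x : T}
\tag{2}
```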

For formula (1) above, when the type of the left variable e1 is T1, the type of the right variable e2 is T2, and T1 is a subtype of T2, the result of the binary operation is of type T2.

Similarly, for formula (2) above, when the return value type of the function is T, the variable assigned by calling the function is also given type T.

The present embodiments also provide a computer readable medium storing program code which, when run on a computer, causes the computer to perform any of the methods of fig. 6-17 described above.

The embodiment of the application also provides a chip, which comprises: at least one processor and a memory, wherein the at least one processor is coupled to the memory and is configured to read and execute instructions in the memory to perform any of the methods of fig. 6-17 described above.

The embodiment of the application also provides an electronic device, which comprises: at least one processor and a memory, wherein the at least one processor is coupled to the memory and is configured to read and execute instructions in the memory to perform any of the methods of fig. 6-17 described above.

The above embodiments may be used alone or in combination with each other to achieve different technical effects.

One or more of the modules or units described herein may be implemented in software, hardware, or a combination of both. When any of the above modules or units is implemented in software, the software exists in the form of computer program instructions stored in a memory, and a processor can be used to execute the program instructions and implement the above method flows. The processor may include, but is not limited to, at least one of: a central processing unit (CPU), a microprocessor, a digital signal processor (DSP), a microcontroller unit (MCU), or an artificial intelligence processor, each of which may include one or more cores for executing software instructions to perform operations or processing. The processor may be built into a system on a chip (SoC) or an application-specific integrated circuit (ASIC), or may be a separate semiconductor chip. In addition to the cores for executing software instructions to perform operations or processing, the processor may further include necessary hardware accelerators, such as field programmable gate arrays (FPGAs), programmable logic devices (PLDs), or logic circuits implementing dedicated logic operations.

When the modules or units described herein are implemented in hardware, the hardware may be any one or any combination of a CPU, a microprocessor, a DSP, an MCU, an artificial intelligence processor, an ASIC, an SoC, an FPGA, a PLD, a dedicated digital circuit, a hardware accelerator, or a non-integrated discrete device, which may run the necessary software or operate independently of software to perform the above method flows.

When the modules or units described herein are implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, in a wired manner (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or in a wireless manner (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk (SSD)).

Those of ordinary skill in the art will appreciate that the elements and method steps of the examples described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or as a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method of side channel detection, the method comprising:

acquiring attribute information of first sensitive data, wherein the attribute information is marked on the first sensitive data;

determining whether the first sensitive data meets a side channel security constraint based on attribute information of the first sensitive data, wherein the side channel comprises one or more of a branch execution time side channel, a variable time function side channel, a cache time side channel and a variable time instruction side channel;

and if the first sensitive data does not meet the side channel security constraint, recording error information in a compiling log.

2. The method of claim 1, wherein prior to the acquiring attribute information of the first sensitive data, the method further comprises:

marking the attribute information of the first sensitive data in source code.

3. The method according to claim 1 or 2, wherein the determining whether the first sensitive data satisfies a side channel security constraint based on attribute information of the first sensitive data comprises:

determining whether the first sensitive data satisfies a desensitization condition;

and if the first sensitive data does not meet the desensitization condition, determining whether the first sensitive data meets side channel security constraints based on attribute information of the first sensitive data.

4. A method according to any one of claims 1 to 3, further comprising:

and performing type reasoning on the first sensitive data.

5. The method according to any one of claims 1 to 4, wherein the determining whether the first sensitive data satisfies a side channel security constraint based on attribute information of the first sensitive data comprises:

acquiring conditional statement information stored in the compiled intermediate representation;

determining whether the first sensitive data exists in the conditional statement information based on attribute information of the first sensitive data;

and if the first sensitive data exists in the conditional statement information, determining that the first sensitive data does not meet the branch execution time side channel security constraint.

6. The method according to any one of claims 1 to 5, wherein the determining whether the first sensitive data satisfies a side channel security constraint based on attribute information of the first sensitive data comprises:

acquiring calling information of a calling function in a compiling process, wherein the calling information of the calling function comprises names of the calling function and parameter lists of the calling function;

determining whether the calling function is a variable time function according to the name of the calling function;

if the calling function is a variable time function, determining whether the first sensitive data exists in a parameter list of the calling function or not based on attribute information of the first sensitive data;

and if the first sensitive data exists in the parameter list of the calling function, determining that the first sensitive data does not meet the variable time function side channel security constraint.

7. The method according to any one of claims 1 to 6, wherein the determining whether the first sensitive data satisfies a side channel security constraint based on attribute information of the first sensitive data comprises:

obtaining dereferencing statement information stored in a compiled intermediate representation, wherein the dereferencing statement information comprises a pointer calculation address offset expression;

determining whether the first sensitive data is used as pointer offset access data in the pointer calculation address offset expression based on attribute information of the first sensitive data;

and if the first sensitive data is determined to be used as pointer offset access data in the pointer calculation address offset expression, determining that the first sensitive data does not meet the cache time side channel security constraint.

8. The method according to any one of claims 1 to 7, wherein the determining whether the first sensitive data satisfies a side channel security constraint based on attribute information of the first sensitive data comprises:

determining a CPU architecture of a central processing unit when a program is compiled;

determining a variable time instruction corresponding to the CPU architecture;

determining whether the first sensitive data exists in an operand of the variable time instruction corresponding operation based on attribute information of the first sensitive data;

and if the first sensitive data is determined to exist in the operand of the variable time instruction corresponding operation, determining that the first sensitive data does not meet the variable time instruction side channel security constraint.

9. A method according to claim 3, wherein said determining whether said first sensitive data satisfies a desensitization condition comprises:

constructing a control flow graph consisting of nodes of each branch of the program;

when the execution depths of the branches are consistent and each operation of the branches is equivalent, determining that the first sensitive data meets a desensitization condition; or,

and when the execution depths of the branches are not consistent and/or each step of operation of the branches is not equivalent, determining that the first sensitive data does not meet a desensitization condition.

10. The method of claim 3 or 9, wherein the determining whether the first sensitive data satisfies a desensitization condition comprises:

obtaining a first list in a program configuration file, wherein the first list comprises one or more obfuscation operations;

determining that the first sensitive data satisfies a desensitization condition when the first sensitive data is processed by an obfuscation operation in the first list; or,

when the first sensitive data is not processed by the obfuscation operation in the first list, determining that the first sensitive data does not satisfy a desensitization condition.

11. The method of claim 4, wherein said type reasoning for said first sensitive data comprises:

determining whether a first variable in a program is sensitive or not according to the type rule of an expression and the subtype inference rule of the variable based on the attribute information of the first sensitive data, wherein the first variable refers to a variable except the first sensitive data in the program;

when the first variable is determined to be sensitive, the attribute tag of the first sensitive data is passed to the first variable.

12. The method according to any of claims 1 to 11, wherein the error information comprises error location and/or violation constraint details.

13. An apparatus for side channel detection, the apparatus comprising:

the acquisition module is used for acquiring attribute information of first sensitive data, wherein the attribute information is marked on the first sensitive data;

the constraint detection module is used for determining whether the first sensitive data meets side channel security constraints based on attribute information of the first sensitive data, wherein the side channels comprise one or more of branch execution time side channels, variable time function side channels, cache time side channels and variable time instruction side channels;

and the recording module is used for recording error information in the compiling log when the first sensitive data does not meet the side channel security constraint.

14. The apparatus of claim 13, wherein the apparatus further comprises:

and the registration module is used for marking the attribute information of the first sensitive data in the source code.

15. The apparatus according to claim 13 or 14, characterized in that the apparatus further comprises:

a desensitization module for determining whether the first sensitive data satisfies a desensitization condition;

the constraint detection module is specifically configured to:

and when the first sensitive data does not meet the desensitization condition, determining whether the first sensitive data meets side channel security constraints based on attribute information of the first sensitive data.

16. The apparatus according to any one of claims 13 to 15, further comprising:

and the type inference module is used for carrying out type inference on the first sensitive data.

17. The apparatus according to any one of claims 13 to 16, wherein the constraint detection module is specifically configured to:

when the first sensitive data exists in the conditional statement information, determining that the first sensitive data does not meet the branch execution time side channel security constraint.
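The branch check in this claim can be illustrated with a small sketch. The application concerns source code processed by a compiler; Python's standard `ast` module is used here only as a convenient stand-in for a compiler front end, and the function name `sensitive_branch_conditions` and the set of sensitive names passed to it are hypothetical.

```python
import ast
from typing import List, Set, Tuple


def sensitive_branch_conditions(source: str, sensitive: Set[str]) -> List[Tuple[int, str]]:
    """Return (line, name) pairs where a sensitive name appears in a branch condition."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # if statements, while loops and conditional expressions all carry a `test`.
        if isinstance(node, (ast.If, ast.While, ast.IfExp)):
            for sub in ast.walk(node.test):
                if isinstance(sub, ast.Name) and sub.id in sensitive:
                    hits.append((sub.lineno, sub.id))
    return sorted(hits)
```

Each returned pair would correspond to one violation of the branch execution time side channel security constraint.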

18. The apparatus according to any one of claims 13 to 17, wherein the constraint detection module is specifically configured to:

when the calling function is a variable time function, determining whether the first sensitive data exists in a parameter list of the calling function or not based on attribute information of the first sensitive data;

and when the first sensitive data exists in the parameter list of the calling function, determining that the first sensitive data does not meet the variable time function side channel security constraint.
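The parameter-list check in this claim might be sketched as follows. The `Call` record, the function name `check_calls`, and the contents of `VARIABLE_TIME_FUNCTIONS` are assumptions made for the example; `memcmp` and `strcmp` are listed only as commonly cited functions whose running time can depend on their inputs, not as a list taken from the application.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

# Illustrative assumption: functions treated as variable time by this sketch.
VARIABLE_TIME_FUNCTIONS = {"memcmp", "strcmp"}


@dataclass
class Call:
    function: str         # name of the called function
    arguments: List[str]  # variable names passed as parameters
    line: int             # source line of the call


def check_calls(calls: Iterable[Call], sensitive: Set[str]) -> List[str]:
    """Return one error message per variable time call that receives sensitive data."""
    errors = []
    for call in calls:
        if call.function not in VARIABLE_TIME_FUNCTIONS:
            continue
        leaked = [arg for arg in call.arguments if arg in sensitive]
        if leaked:
            errors.append(
                f"line {call.line}: {call.function} called with sensitive {', '.join(leaked)}"
            )
    return errors
```

The returned strings play the role of the error information that would be recorded in the compiling log.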

19. The apparatus according to any one of claims 13 to 18, wherein the constraint detection module is specifically configured to:

when the first sensitive data is determined to be used as a pointer offset for accessing data in a pointer address-offset calculation expression, determining that the first sensitive data does not meet the cache time side channel security constraint.
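The pointer-offset check in this claim can be illustrated as follows. As a stand-in for pointer arithmetic in compiled code, the sketch looks at subscript expressions in Python source parsed with the standard `ast` module; the function name `sensitive_index_accesses` is hypothetical, and treating a subscript index as the address offset is an assumption of the example.

```python
import ast
from typing import List, Set, Tuple


def sensitive_index_accesses(source: str, sensitive: Set[str]) -> List[Tuple[int, str]]:
    """Return (line, name) pairs where a sensitive name is used inside an index."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Subscript):
            # Any sensitive name inside the index makes the accessed address secret-dependent.
            for sub in ast.walk(node.slice):
                if isinstance(sub, ast.Name) and sub.id in sensitive:
                    hits.add((node.lineno, sub.id))
    return sorted(hits)
```

A table lookup such as `table[key & 0xff]` is the typical pattern flagged by this check, since the cache line touched depends on the value of `key`.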

20. The apparatus according to any one of claims 13 to 19, wherein the constraint detection module is specifically configured to:

determining a variable time instruction corresponding to the CPU architecture;

when it is determined that the first sensitive data is present in an operand of the operation corresponding to the variable time instruction, determining that the first sensitive data does not satisfy the variable time instruction side channel security constraint.
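A minimal sketch of the per-architecture instruction check in this claim is shown below. The table `VARIABLE_TIME_INSTRUCTIONS` is an illustrative assumption: integer division is listed because its latency is commonly data dependent, but the entries are not taken from the application and are not an authoritative list for any CPU. The function name `check_instructions` and the `(opcode, operands)` representation are likewise hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Illustrative assumption: variable time instructions per CPU architecture.
VARIABLE_TIME_INSTRUCTIONS: Dict[str, Set[str]] = {
    "x86_64": {"div", "idiv"},
    "aarch64": {"udiv", "sdiv"},
}


def check_instructions(
    arch: str,
    instructions: Sequence[Tuple[str, Sequence[str]]],
    sensitive: Set[str],
) -> List[Tuple[int, str]]:
    """Return (index, opcode) for variable time instructions with a sensitive operand."""
    variable_time = VARIABLE_TIME_INSTRUCTIONS.get(arch, set())
    violations = []
    for index, (opcode, operands) in enumerate(instructions):
        if opcode in variable_time and any(op in sensitive for op in operands):
            violations.append((index, opcode))
    return violations
```

Selecting the instruction set by `arch` first mirrors the order of the steps in the claim: the variable time instructions are determined for the CPU architecture, and only then are the operands inspected.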

21. The apparatus of claim 15, wherein the desensitization module is specifically configured to:

22. The apparatus according to claim 15 or 21, wherein the desensitization module is further specifically configured to:

23. The apparatus of claim 16, wherein the type inference module is specifically configured to:

24. The apparatus of claim 15, wherein the desensitization module is further configured to:

when the first sensitive data meets the desensitization condition, carry out a desensitization operation on the first sensitive data.

25. The apparatus according to any of claims 13 to 24, wherein the error information comprises an error location and/or details of the violated constraint.

26. An electronic device, comprising:

one or more processors;

one or more memories;

and one or more computer programs, wherein the one or more computer programs are stored in the one or more memories, the one or more computer programs comprising instructions, which when executed by the one or more processors, cause the electronic device to perform the method of any of claims 1-12.

27. A computer readable storage medium, characterized in that the storage medium has stored therein a program or instructions which, when executed, implement the method of any one of claims 1 to 12.

28. A chip, comprising:

one or more processors;

one or more memories, wherein the one or more processors are coupled to the one or more memories and are configured to read and execute instructions in the one or more memories to perform the method of any of claims 1-12.

29. A computer program product comprising computer program code for implementing the method according to any of claims 1 to 12 when said computer program code is run on a computer.