CN116049842A - Access log-based ABAC strategy extraction and optimization method - Google Patents
- Publication number
- CN116049842A (application number CN202211153020.3A)
- Authority
- CN
- China
- Prior art keywords
- access
- attribute
- rule
- log
- strategy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/604—Tools and structures for managing or administering access control systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/03—Data mining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2141—Access rights, e.g. capability lists, access control lists, access tables, access matrices
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Bioethics (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Automation & Control Theory (AREA)
- Data Mining & Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an ABAC policy extraction and optimization method based on access logs, comprising the following steps. Step one, data preprocessing: the entities and relations in the access log are formalized using sets, relations, functions and the like, and all numerical variables are converted into corresponding discrete variables. Step two, clustering of access logs: all authorized access logs related to the policy rules are determined, and the access log set is divided into clusters using data mining techniques, so that each cluster of records corresponds to one ABAC rule. Step three, rule extraction: the attribute conditions of each rule are extracted from the record features. For a given access log set containing user access requests and system authorization decisions, the invention uses clustering to determine the initial number of policy rules, which reduces the scale of policy extraction; it also supports both positive and negative attribute conditions, which makes policy descriptions more flexible and convenient and improves the interpretability of the policy.
Description
Technical Field
The invention relates to the technical field of access control, in particular to an ABAC strategy extraction and optimization method based on an access log.
Background
With the rapid development of emerging computing and information technologies such as edge computing, social networks and blockchain, traditional access control models can no longer meet the fine-grained functional requirements of practical application scenarios. Attribute-Based Access Control (ABAC) provides a flexible approach to complex, dynamic system authorization requirements. To implement the ABAC mechanism successfully, it is important to determine an appropriate authorization policy and build a sound ABAC system. For this purpose, ABAC-based policy engineering techniques have been proposed, with two construction methods: top-down and bottom-up. Compared with top-down manual processing, which is time-consuming, laborious and error-prone, the bottom-up method mines policy rules automatically or semi-automatically and migrates a non-ABAC model to an ABAC system. This reduces cost, time and errors in policy development and management, and the method has received wide attention in academia and industry in recent years.
The bottom-up policy engineering method is also called policy mining. It was first proposed by Kuhlmann et al., who used data mining techniques to construct a role set from a given permission assignment relation, i.e., role mining. Although many role mining methods have been proposed, they are not suitable for extracting ABAC policies. For this reason, Xu and Stoller, starting from a given access control log (or list) and the attribute data of its entities, first formulated the ABAC policy mining problem and proposed a mining method. Researchers have since proposed a variety of policy engineering methods. However, existing methods have the following main problems: (1) ABAC policy rules can contain both positive and negative conditions, and negative attribute conditions make authorization policies more flexible and convenient, yet existing methods do not support policy mining with negative conditions; (2) ABAC policy rules should be as compact and accurate as possible, because inconsistent or erroneous policy decisions cause originally authorized access requests to be denied or originally unauthorized requests to be allowed, yet existing methods do not optimize the initially mined policy, so a large number of redundant and erroneous policy rules remain.
Disclosure of Invention
The invention aims to provide an ABAC policy extraction and optimization method based on access logs. For a given access log set comprising user access requests and system authorization decisions, it uses clustering to determine the initial number of policy rules, which reduces the scale of policy extraction; it also supports both positive and negative attribute conditions, which makes policy descriptions more flexible and convenient and improves the interpretability of the policy.
The aim of the invention can be achieved by the following technical scheme:
an ABAC strategy extraction and optimization method based on access logs comprises the following steps:
step one, data preprocessing: the entities and relations in the access log are formalized using sets, relations, functions and the like, and all numerical variables are converted into corresponding discrete variables;
step two, clustering of access logs: all authorized access logs related to the policy rules are determined; the access log set is divided into clusters using data mining techniques, so that each cluster, containing a number of records, corresponds to one ABAC rule;
step three, rule extraction: the attribute conditions of each rule, i.e., combinations of attribute-value pairs, are extracted by searching for similar patterns among the record features, i.e., the attribute conditions of the access authorization tuples;
step four, policy optimization: compared with the original rules, the rules extracted from the access log may be too strict or too relaxed; an extracted rule is considered strict if it contains more, and more complex, attribute conditions than the original rule; conversely, a rule that contains only a few simple attribute conditions is considered relaxed; the extracted policy is corrected repeatedly against the original access log to further improve the quality of the ABAC policy.
As a further scheme of the invention: in the first step, the specific steps of the data preprocessing stage are as follows:
step 1a: U, O, S and OP denote the user (subject) set, object set, session set and operation set of the system, respectively; A_u, A_o and A_s denote the attributes of a user u, an object o and a session s, respectively; E and A denote the set of all entities and the set of all entity attributes in the system, where E = U ∪ O ∪ S and A = A_U ∪ A_O ∪ A_S; V_a denotes the set of all possible values of attribute a in the system; f_{a_e}(e, a) denotes the value function of attribute a of entity e;
step 1b: an attribute-value pair expression is a tuple of the form <a, ⊕v>, where a is the attribute name, v is the attribute value, and ⊕ ∈ {"=", "!", ">", "<"} is a relational operator expressing the value relation between a and v; for example, <a, =v> means that a takes the value v and is called a positive attribute expression, abbreviated <a, v>; <a, !v> means that a takes any value other than v and is called a negative attribute expression; for convenience of description, only the first two value relations are discussed; AC denotes the set of all attribute conditions, and EAV denotes the assignment relation between entities and attribute conditions;
step 1c: in ABAC, session attributes involve dynamic factors such as time, location or the access control scenario; during preprocessing, these continuous attribute variables are discretized, for example by converting the access time into a working period or a non-working period, so that an ABAC policy Π of the form <E, A, OP, EAV, AC> can be extracted.
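The preprocessing in steps 1b and 1c can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the working-hours window, the class name `AttrCondition` and the function `discretize_hour` are hypothetical and do not come from the source.

```python
from dataclasses import dataclass

# Hypothetical working-hours window; the source does not fix these bounds.
WORK_START, WORK_END = 9, 17


def discretize_hour(hour: int) -> str:
    """Map a numeric access hour (0-23) to a discrete session attribute value."""
    return "working" if WORK_START <= hour < WORK_END else "non_working"


@dataclass(frozen=True)
class AttrCondition:
    """An attribute-value pair <a, v> (positive) or <a, !v> (negative)."""
    attribute: str
    value: str
    negated: bool = False

    def holds(self, entity_attrs: dict) -> bool:
        # Positive: the attribute equals v; negative: it takes any other value.
        matches = entity_attrs.get(self.attribute) == self.value
        return not matches if self.negated else matches


# One preprocessed access record with a discretized session attribute.
record = {"role": "nurse", "dept": "cardiology", "time": discretize_hour(22)}
positive = AttrCondition("role", "nurse")                   # <role, nurse>
negative = AttrCondition("time", "working", negated=True)   # <time, !working>
print(positive.holds(record), negative.holds(record))
```

Representing negation as a flag on the pair keeps positive and negative conditions in one set, which matches the single set AC described in step 1b.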
As a further scheme of the invention: in the second step, the access log set AL is divided by using a clustering technology, and the specific steps are as follows:
step 2a: the access log data set AL is divided into k partitions c_1, c_2, …, c_k using Partitioning Around Medoids (PAM), a method similar to the k-means algorithm, with the k initial medoids (center points) of the clusters selected at random;
step 2b: for any cluster c_i, the distance dis(al_i, associate(al_i)) from its medoid al_i to the other, non-medoid points is calculated, where associate(al_i) denotes all other records in the cluster associated with the medoid al_i;
step 2c: dis(al_i, associate(al_i)) is compared with dis(al_j, associate(al_i) \ {al_j} ∪ {al_i}) to decide whether to exchange al_i and al_j and thereby determine the new medoid;
step 2d: for different values of k, the clustering algorithm is run repeatedly and the accuracy and error rate of the model are calculated; the value of k that best balances policy accuracy against policy complexity is selected as the initial number of policy rules.
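Steps 2a to 2c can be sketched as a basic PAM swap search in Python. This is a sketch under assumptions, not the patent's implementation: the distance between two records is taken to be the Hamming distance over categorical attributes (the source's distance formula is not available), and the names `hamming`, `total_cost` and `pam` are hypothetical.

```python
import random
from itertools import product


def hamming(r1: tuple, r2: tuple) -> int:
    """Number of attribute positions where two records differ (assumed distance)."""
    return sum(a != b for a, b in zip(r1, r2))


def total_cost(records, medoids):
    """Sum of distances from every record to its nearest medoid."""
    return sum(min(hamming(r, records[m]) for m in medoids) for r in records)


def pam(records, k, seed=0):
    """Partition records into k clusters around medoids (basic swap search)."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(records)), k)  # step 2a: random initial medoids
    best = total_cost(records, medoids)
    improved = True
    while improved:
        improved = False
        for pos, cand in product(range(k), range(len(records))):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(records, trial)
            if cost < best:  # step 2c: swap only when the total distance decreases
                medoids, best, improved = trial, cost, True
    clusters = {m: [] for m in medoids}
    for idx, r in enumerate(records):
        nearest = min(medoids, key=lambda m: hamming(r, records[m]))
        clusters[nearest].append(idx)
    return medoids, clusters, best


# Toy access records: (role, department, operation).
records = [
    ("nurse", "cardio", "read"), ("nurse", "cardio", "read"),
    ("nurse", "cardio", "write"), ("admin", "it", "write"),
    ("admin", "it", "write"), ("admin", "it", "read"),
]
medoids, clusters, best = pam(records, k=2)
print(sorted(sorted(members) for members in clusters.values()), best)
```

For step 2d, the same function could be called for several values of k and the resulting policies compared on accuracy and complexity; that selection loop is left out here because the source does not give its exact criterion.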
As a further scheme of the invention: in step three, the attribute conditions of each rule, i.e., combinations of attribute-value pairs, are extracted; the specific steps are as follows:
step 3a: valid positive or negative attribute-value pairs are defined; for a rule ρ corresponding to cluster c_i, <a, v | !v> is called an effective attribute condition if and only if, for a given threshold T_p or T_n, the frequency with which attribute value v appears in the logs of c_i is higher or lower than the frequency with which it appears in the original log set; <a, v | !v> is then added to the attribute conditions of the authorization rule ρ = <AC, op>; the set of effective attribute conditions of ρ is denoted EAC_ρ, and the set of all rules is denoted P;
step 3b: based on the definition of effective attribute conditions, the extraction process for any given cluster c_i is given in Algorithm 1;
algorithm 1. Effective attribute condition extraction:
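The definition in step 3a can be sketched in Python as follows. This is an illustrative sketch, not the patent's Algorithm 1. It assumes that a positive condition <a, v> is effective when the frequency of v in the cluster exceeds its frequency in the whole log by at least T_p, and that a negative condition <a, !v> is effective when it falls short by at least T_n. The function names and default thresholds are hypothetical.

```python
from collections import Counter


def value_freq(records, attr):
    """Relative frequency of each value of `attr` in a list of dict records."""
    counts = Counter(r[attr] for r in records)
    return {v: c / len(records) for v, c in counts.items()}


def extract_eac(cluster, all_logs, attrs, t_p=0.3, t_n=0.3):
    """Return effective attribute conditions as (attribute, value, negated) tuples."""
    eac = set()
    for a in attrs:
        in_cluster = value_freq(cluster, a)
        overall = value_freq(all_logs, a)
        for v, f_all in overall.items():
            f_c = in_cluster.get(v, 0.0)
            if f_c - f_all >= t_p:      # v is over-represented: positive <a, v>
                eac.add((a, v, False))
            elif f_all - f_c >= t_n:    # v is under-represented: negative <a, !v>
                eac.add((a, v, True))
    return eac


all_logs = [
    {"role": "nurse", "time": "working"}, {"role": "nurse", "time": "working"},
    {"role": "doctor", "time": "working"}, {"role": "doctor", "time": "non_working"},
    {"role": "admin", "time": "working"}, {"role": "admin", "time": "non_working"},
]
cluster = all_logs[:2]  # records assigned to one cluster c_i
eac = extract_eac(cluster, all_logs, attrs=["role", "time"])
print(sorted(eac))
```

Raising `t_n` above `t_p` would make the sketch more conservative about emitting negative conditions, which may keep the extracted rules shorter.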
As a further scheme of the invention: in step four, the specific steps are as follows:
step 4a: the original access log AL = {<rq, d>} is divided into a positive log and a negative log:
AL+ = {<rq, d> | <rq, d> ∈ AL ∧ d = permitted};
AL- = {<rq, d> | <rq, d> ∈ AL ∧ d = denied};
AL = AL+ ∪ AL-.
where <rq, d> is an authorization (or access) record describing the system's authorization decision d for the access request rq; d takes the value "permitted" (access granted) or "denied" (access denied);
step 4b: based on the original logs AL+ and AL- and the initially extracted policy Π_m, the records of type true positive (TP), false positive (FP), true negative (TN) and false negative (FN) are determined; for a policy Π:
TP_Π|AL = {<rq, d> | <rq, d> ∈ AL+ : d_Π(rq) = permitted}: for an access request rq in the positive log AL+, the authorization decision d_Π(rq) made by Π also permits access;
FP_Π|AL = {<rq, d> | <rq, d> ∈ AL- : d_Π(rq) = permitted}: for an access request rq in the negative log AL-, the authorization decision d_Π(rq) made by Π permits access;
TN_Π|AL = {<rq, d> | <rq, d> ∈ AL- : d_Π(rq) = denied}: for an access request rq in the negative log AL-, the authorization decision d_Π(rq) made by Π also denies access;
FN_Π|AL = {<rq, d> | <rq, d> ∈ AL+ : d_Π(rq) = denied}: for an access request rq in the positive log AL+, the authorization decision d_Π(rq) made by Π denies access;
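The four record sets of step 4b can be computed with a short Python sketch. It rests on assumptions that the source does not spell out: a rule is a list of (attribute, value, negated) conditions that must all hold, and a policy permits a request when at least one rule matches. The function names are hypothetical.

```python
def rule_matches(rule, request):
    """A rule is a list of (attribute, value, negated) conditions; all must hold."""
    return all((request.get(a) == v) != neg for a, v, neg in rule)


def decide(policy, request):
    """Permit when at least one rule matches, otherwise deny (assumed semantics)."""
    return "permitted" if any(rule_matches(r, request) for r in policy) else "denied"


def partition(policy, access_log):
    """Split <rq, d> records into TP, FP, TN and FN with respect to `policy`."""
    sets = {"TP": [], "FP": [], "TN": [], "FN": []}
    for rq, d in access_log:
        predicted = decide(policy, rq)
        if d == "permitted":   # record belongs to AL+
            sets["TP" if predicted == "permitted" else "FN"].append((rq, d))
        else:                  # record belongs to AL-
            sets["FP" if predicted == "permitted" else "TN"].append((rq, d))
    return sets


# An extracted policy with one (too strict) rule: nurses during working periods.
policy = [[("role", "nurse", False), ("time", "working", False)]]
access_log = [
    ({"role": "nurse", "time": "working"}, "permitted"),
    ({"role": "nurse", "time": "non_working"}, "permitted"),
    ({"role": "admin", "time": "working"}, "denied"),
]
sets = partition(policy, access_log)
print({name: len(records) for name, records in sets.items()})
```

In this toy log the second record lands in FN, which is the kind of evidence step four uses to detect that an extracted rule is too strict.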
step 4c: the FN and FP log records are used as training data sets, from which the policies Π_FN and Π_FP are extracted, respectively;
step 4d: Π_FN and Π_FP are compared with Π_m, and redundant attribute conditions are removed from strict rules or missing attribute conditions are added to relaxed rules; in each optimization pass, the rules in Π_m that are similar to rules in Π_FN or Π_FP are selected and processed in two ways:
for any rule ρ_i of Π_FN, if Π_m contains a rule ρ_j similar to ρ_i, the redundant attribute conditions are deleted from ρ_j; if Π_m contains no rule similar to ρ_i, then ρ_i is a missing rule and is added directly to Π_m;
for any rule ρ_i of Π_FP, if Π_m contains a rule ρ_j similar to ρ_i, the missing attribute conditions are added to ρ_j.
As a further scheme of the invention: in the fourth step, the optimization process is as shown in algorithm 2:
algorithm 2 policy optimization
The invention has the beneficial effects that:
(1) For a given access log set containing user access requests and system authorization decisions, clustering is used to determine the initial number of policy rules, which reduces the scale of policy extraction;
(2) Both positive and negative attribute conditions are supported, which makes policy descriptions more flexible and convenient and improves the interpretability of the policy;
(3) Based on the principles of correctness and conciseness, policy quality evaluation criteria are given; the effectiveness and efficiency of the method are verified on synthetic and real data sets; the extracted policies are of higher quality, which brings notable economic and social benefits.
Drawings
The present invention is further described below with reference to the accompanying drawings for the convenience of understanding by those skilled in the art.
FIG. 1 is a block diagram of an ABAC policy extraction and optimization process based on access logs;
FIG. 2 is a graph showing the comparison of the effects of the extraction strategy of the present invention;
FIG. 3 is a graph comparing the effects of the optimization strategy of the present invention;
Detailed Description
The technical solutions of the present invention will be clearly and completely described in connection with the embodiments, and it is obvious that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1-3, an ABAC policy extraction and optimization method based on access logs includes the following steps:
step one, data preprocessing: the entities and relations in the access log are formalized using sets, relations, functions and the like, and all numerical variables are converted into corresponding discrete variables; the specific steps of the data preprocessing stage are as follows:
step 1a: U, O, S and OP denote the user (subject) set, object set, session set and operation set of the system, respectively; A_u, A_o and A_s denote the attributes of a user u, an object o and a session s, respectively; E and A denote the set of all entities and the set of all entity attributes in the system, where E = U ∪ O ∪ S and A = A_U ∪ A_O ∪ A_S; V_a denotes the set of all possible values of attribute a in the system; f_{a_e}(e, a) denotes the value function of attribute a of entity e;
step 1b: an attribute-value pair expression is a tuple of the form <a, ⊕v>, where a is the attribute name, v is the attribute value, and ⊕ ∈ {"=", "!", ">", "<"} is a relational operator expressing the value relation between a and v; for example, <a, =v> means that a takes the value v and is called a positive attribute expression, abbreviated <a, v>; <a, !v> means that a takes any value other than v and is called a negative attribute expression; for convenience of description, only the first two value relations are discussed; AC denotes the set of all attribute conditions, and EAV denotes the assignment relation between entities and attribute conditions;
step 1c: in ABAC, session attributes involve dynamic factors such as time, location or the access control scenario; during preprocessing, these continuous attribute variables are discretized, for example by converting the access time into a working period or a non-working period, so that an ABAC policy Π of the form <E, A, OP, EAV, AC> can be extracted;
step two, clustering of access logs: all authorized access logs related to the policy rules are determined; the access log set is divided into clusters using data mining techniques, so that each cluster, containing a number of records, corresponds to one ABAC rule; the access log set AL is divided using a clustering technique, with the following specific steps:
step 2a: the access log data set AL is divided into k partitions c_1, c_2, …, c_k using Partitioning Around Medoids (PAM), a method similar to the k-means algorithm, with the k initial medoids (center points) of the clusters selected at random;
step 2b: for any cluster c_i, the distance dis(al_i, associate(al_i)) from its medoid al_i to the other, non-medoid points is calculated, where associate(al_i) denotes all other records in the cluster associated with the medoid al_i;
step 2c: dis(al_i, associate(al_i)) is compared with dis(al_j, associate(al_i) \ {al_j} ∪ {al_i}) to decide whether to exchange al_i and al_j and thereby determine the new medoid;
step 2d: for different values of k, the clustering algorithm is run repeatedly and the accuracy and error rate of the model are calculated; the value of k that best balances policy accuracy against policy complexity is selected as the initial number of policy rules;
step three, rule extraction: the attribute conditions of each rule, i.e., combinations of attribute-value pairs, are extracted by searching for similar patterns among the record features, i.e., the attribute conditions of the access authorization tuples; the specific steps are as follows:
step 3a: valid positive or negative attribute-value pairs are defined; for a rule ρ corresponding to cluster c_i, <a, v | !v> is called an effective attribute condition if and only if, for a given threshold T_p or T_n, the frequency with which attribute value v appears in the logs of c_i is higher or lower than the frequency with which it appears in the original log set; <a, v | !v> is then added to the attribute conditions of the authorization rule ρ = <AC, op>; the set of effective attribute conditions of ρ is denoted EAC_ρ, and the set of all rules is denoted P;
step 3b: based on the definition of effective attribute conditions, the extraction process for any given cluster c_i is given in Algorithm 1;
algorithm 1. Effective attribute condition extraction:
step four, policy optimization: compared with the original rules, the rules extracted from the access log may be too strict or too relaxed; an extracted rule is considered strict if it contains more, and more complex, attribute conditions than the original rule; the specific steps are as follows:
step 4a: the original access log AL = {<rq, d>} is divided into a positive log and a negative log:
AL+ = {<rq, d> | <rq, d> ∈ AL ∧ d = permitted};
AL- = {<rq, d> | <rq, d> ∈ AL ∧ d = denied};
AL = AL+ ∪ AL-.
where <rq, d> is an authorization (or access) record describing the system's authorization decision d for the access request rq; d takes the value "permitted" (access granted) or "denied" (access denied);
step 4b: based on the original logs AL+ and AL- and the initially extracted policy Π_m, the records of type true positive (TP), false positive (FP), true negative (TN) and false negative (FN) are determined; for a policy Π:
TP_Π|AL = {<rq, d> | <rq, d> ∈ AL+ : d_Π(rq) = permitted}: for an access request rq in the positive log AL+, the authorization decision d_Π(rq) made by Π also permits access;
FP_Π|AL = {<rq, d> | <rq, d> ∈ AL- : d_Π(rq) = permitted}: for an access request rq in the negative log AL-, the authorization decision d_Π(rq) made by Π permits access;
TN_Π|AL = {<rq, d> | <rq, d> ∈ AL- : d_Π(rq) = denied}: for an access request rq in the negative log AL-, the authorization decision d_Π(rq) made by Π also denies access;
FN_Π|AL = {<rq, d> | <rq, d> ∈ AL+ : d_Π(rq) = denied}: for an access request rq in the positive log AL+, the authorization decision d_Π(rq) made by Π denies access;
step 4c: the FN and FP log records are used as training data sets, from which the policies Π_FN and Π_FP are extracted, respectively;
step 4d: Π_FN and Π_FP are compared with Π_m, and redundant attribute conditions are removed from strict rules or missing attribute conditions are added to relaxed rules; in each optimization pass, the rules in Π_m that are similar to rules in Π_FN or Π_FP are selected and processed in two ways:
for any rule ρ_i of Π_FN, if Π_m contains a rule ρ_j similar to ρ_i, the redundant attribute conditions are deleted from ρ_j; if Π_m contains no rule similar to ρ_i, then ρ_i is a missing rule and is added directly to Π_m;
for any rule ρ_i of Π_FP, if Π_m contains a rule ρ_j similar to ρ_i, the missing attribute conditions are added to ρ_j; a rule that contains only a few simple attribute conditions is considered relaxed; the extracted policy is corrected repeatedly against the original access log to further improve the quality of the ABAC policy; the optimization procedure is shown in Algorithm 2:
algorithm 2 policy optimization
Working principle: the effectiveness and efficiency of the invention are further verified through experimental evaluation;
the policy extraction and optimization method of the invention was executed on several policy data sets, both synthetic and real; the synthetic access logs are derived from randomly created policy sets, including the UniversityP, HealthcareP, ProjectManagementP, UniversityPN, HealthcarePN and ProjectManagementPN data sets; the authorization rules of these policies are built on random attributes and attribute value sets, so that the policy extraction effect can be evaluated on access logs of different sizes and varying structural characteristics; to construct the input data, a set of authorization tuples is created for each ABAC policy and the ABAC policy corresponding to each access right is evaluated; the real data sets are the public access log data sets Amazon Kaggle and Amazon UCI; Amazon Kaggle records employees' access requests to resources and whether each request was authorized, together with the employees' attribute values and the resource identifiers; this data set contains 12000 users and 7000 object resources in total; Amazon UCI contains more than 36000 users, 27000 permissions and 33000 attribute features;
the hardware environment for all experiments was an Intel i5-7400 CPU with 8 GB of memory and a 64-bit Windows 10 operating system; policy extraction and optimization were implemented in a Python 3 development environment;
Precision, Recall, Accuracy and the F1 value are used to evaluate how well the extracted policy matches the original policy; in terms of the TP, FP, TN and FN record sets, they are calculated as follows:
Precision = |TP| / (|TP| + |FP|)
Recall = |TP| / (|TP| + |FN|)
Accuracy = (|TP| + |TN|) / (|TP| + |FP| + |TN| + |FN|)
F1 = 2 × Precision × Recall / (Precision + Recall)
Accuracy can be misleading on unbalanced data sets, so the F1 value is chosen as the evaluation index: the larger the F1 value, the higher the quality of the extracted policy and the more consistent it is with the original access log;
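The four evaluation measures, in their standard definitions over TP, FP, TN and FN counts, can be computed with the short Python sketch below. The function name `scores` and the example counts are hypothetical and are not experimental data from the source. The example uses a deliberately unbalanced log to show why accuracy alone can look better than the policy deserves.

```python
def scores(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Precision, recall, accuracy and F1 from the four record counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Hypothetical unbalanced log: 10 permitted requests among 1000 records,
# of which the extracted policy recovers only 5.
result = scores(tp=5, fp=0, tn=990, fn=5)
print(result)
```

Here the accuracy stays close to 1 because denied requests dominate the log, while the F1 value reflects that half of the permitted requests are missed.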
Weighted Structural Complexity (WSC) is another measure of policy quality; for a given ABAC policy, WSC gives a generalized assessment of policy size and is calculated as follows:
WSC(ρ) = WSC(<EAC, op>) = w_1 × WSC(EAC_u) + w_2 × WSC(EAC_o) + w_3 × WSC(EAC_s)
where WSC(EAC_e) = |EAC_e|, |EAC_e| is the number of attribute tuples contained in the attribute condition of entity e, and w_i is a specified weight; the smaller the WSC value, the more compact the policy and the easier it is to manage;
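The WSC formula for a single rule translates directly into code. In the Python sketch below, the dictionary layout of a rule and the function names are hypothetical, and summing rule values to obtain the WSC of a whole policy is an assumption, since the text above only defines WSC for a rule ρ.

```python
def wsc_rule(rule: dict, weights=(1.0, 1.0, 1.0)) -> float:
    """WSC of one rule: weighted count of user, object and session conditions."""
    w_u, w_o, w_s = weights
    return (w_u * len(rule["user"])
            + w_o * len(rule["object"])
            + w_s * len(rule["session"]))


def wsc_policy(rules, weights=(1.0, 1.0, 1.0)) -> float:
    """Assumed aggregate: the sum of the WSC values of all rules in the policy."""
    return sum(wsc_rule(r, weights) for r in rules)


# Conditions are (attribute, value, negated) tuples grouped by entity type.
rule = {
    "user": [("role", "nurse", False)],
    "object": [("type", "record", False), ("dept", "finance", True)],
    "session": [("time", "working", False)],
}
print(wsc_rule(rule), wsc_rule(rule, weights=(1.0, 1.0, 2.0)))
```

With unit weights the measure is simply the number of attribute conditions, so removing a redundant condition in step four lowers WSC by the weight of its entity type.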
the method of the invention was run 10 times on each synthetic or real data set; the best result with respect to evaluation indexes such as running time, accuracy, F1 value and complexity was taken and compared with the methods proposed in "Mining attribute-based access control policies" (Xu Z, Stoller S D. IEEE Transactions on Dependable and Secure Computing, 2014, 12(5): 533-545) and "Mining ABAC rules from sparse logs" (Cotrini C, Weghorn T, Basin D. 2018 IEEE European Symposium on Security and Privacy. IEEE, 2018: 31-46), denoted Xu-Stoller and Cotrini, respectively; the results are shown in Table 1, where "\" indicates that the experimental result is unknown or unsatisfactory; as Table 1 shows, the overall performance of the invention (entries in bold) is better than that of the other two methods on more than half of the data sets, and it performs best on the UniversityP, ProjectManagementP, Amazon Kaggle and Amazon UCI policy sets;
Table 1. Comparison of different policy extraction methods
Furthermore, Figs. 2 and 3 compare the F1 value and the complexity of the three methods, respectively. As Fig. 2 shows, the F1 value of the policies extracted by the invention follows a trend similar to that of the Cotrini method, and both are better than the Xu-Stoller method. As Fig. 3 shows, the complexity of the policies extracted by the invention changes gently and is very close to the trend of the Xu-Stoller method, and both yield policies of significantly higher quality than the Cotrini method.
The preferred embodiments of the invention disclosed above are intended only to assist in the explanation of the invention. The preferred embodiments are not exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. The invention is limited only by the claims and the full scope and equivalents thereof.
Claims (6)
1. An ABAC strategy extraction and optimization method based on access logs is characterized by comprising the following steps:
step one, data preprocessing: the entities and relations in the access log are formalized using sets, relations, functions and the like, and all numerical variables are converted into corresponding discrete variables;
step two, clustering of access logs: all authorized access logs related to the policy rules are determined; the access log set is divided into clusters using data mining techniques, so that each cluster, containing a number of records, corresponds to one ABAC rule;
step three, rule extraction: the attribute conditions of each rule, i.e., combinations of attribute-value pairs, are extracted by searching for similar patterns among the record features, i.e., the attribute conditions of the access authorization tuples;
step four, policy optimization: compared with the original rules, the rules extracted from the access log may be too strict or too relaxed; an extracted rule is considered strict if it contains more, and more complex, attribute conditions than the original rule; conversely, a rule that contains only a few simple attribute conditions is considered relaxed; the extracted policy is corrected repeatedly against the original access log to further improve the quality of the ABAC policy.
2. The ABAC policy extraction and optimization method based on access logs according to claim 1, wherein in the first step, the specific steps of the data preprocessing stage are as follows:
step 1a: U, O, S and OP respectively represent the user (or subject) set, the object set, the session set and the operation set of the system; A_u, A_o and A_s respectively represent the attributes of a user u, an object o and a session s; E and A respectively represent the set of all entities and the set of all entity attributes in the system, where E = U ∪ O ∪ S and A = A_U ∪ A_O ∪ A_S; V_a represents the set of all possible values of attribute a in the system; f_{a_e}(e, a) represents the value function of attribute a of entity e;
step 1b: an attribute-value pair expression is represented by a tuple of the form <a, ⊙v>, where a is the attribute name, v is the attribute value, and ⊙ ∈ {"=", "!", ">", "<"} is a relational operator expressing the value relation between a and v; for example, <a, =v> means that a can take the value v, is called a positive attribute expression and is abbreviated <a, v>; <a, !v> means that a can take any value other than v and is called a negative attribute expression; for convenience of description, only these first two value relations are discussed; AC denotes the set of all attribute conditions, and EAV denotes the assignment relation between all entities and the attribute conditions;
step 1c: in ABAC, session attributes relate to dynamic factors such as time, place or access control scenario; during preprocessing, these continuous attribute variables are converted into discrete ones, for example the access time is converted into a working duration or a discrete working period, so that an ABAC policy Π of the form <E, A, OP, EAV, AC> can be extracted.
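By way of non-limiting illustration, the discretization of step 1c and the positive and negative attribute expressions of step 1b may be sketched in Python as follows. The period boundaries, the labels `working_hours` and `off_hours`, and the names `discretize_hour` and `AttrExpr` are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass

# Hypothetical period boundaries on a 24-hour clock; the specification gives none.
WORK_PERIODS = [(0, 9, "off_hours"), (9, 18, "working_hours"), (18, 24, "off_hours")]


def discretize_hour(hour: int) -> str:
    """Map a continuous access hour to a discrete period label."""
    for start, end, label in WORK_PERIODS:
        if start <= hour < end:
            return label
    raise ValueError(f"hour out of range: {hour}")


@dataclass(frozen=True)
class AttrExpr:
    """Attribute expression <a, v> (positive) or <a, !v> (negative)."""
    attribute: str
    value: str
    negated: bool = False

    def holds(self, entity: dict) -> bool:
        """Check the expression against an entity given as attribute -> value."""
        matches = entity.get(self.attribute) == self.value
        return not matches if self.negated else matches


# A session whose access time (14:00) has been discretized.
session = {"time": discretize_hour(14)}
positive = AttrExpr("time", "working_hours")                # <time, working_hours>
negative = AttrExpr("time", "working_hours", negated=True)  # <time, !working_hours>
print(session, positive.holds(session), negative.holds(session))
```

In this sketch an entity that lacks the attribute satisfies every negative expression on it, which is one possible convention among several.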
3. The ABAC policy extraction and optimization method based on the access log according to claim 1, wherein in the second step, the access log set AL is divided by using a clustering technology, and the specific steps are as follows:
step 2a: dividing the access log data set AL into k different partitions c_1, c_2, …, c_k using the Partitioning Around Medoids (PAM) method, which is similar to the k-means algorithm, and randomly selecting the k initial center points of the cluster partitions;
step 2b: calculating the distance from the center point al_i of any cluster c to the other non-center points:
where associate(al_i) represents all other records in the cluster that are associated with the center point al_i;
step 2c: comparing dis(al_i, associate(al_i)) with dis(al_j, associate(al_i) \ {al_j} ∪ {al_i}) to determine whether to exchange al_i and al_j, and determining a new center point that satisfies:
step 2d: for different values of k, repeatedly running the clustering algorithm and calculating the accuracy and error rate of the model; the value of k that best balances the accuracy and the complexity of the policy is selected and used as the initial number of policy rules.
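By way of non-limiting illustration, steps 2a to 2c may be sketched in Python as a plain swap-based PAM loop. The Hamming distance over categorical record fields, the function names and the sample records are assumptions made for this sketch; the specification's own distance formula is not reproduced here.

```python
import random
from itertools import product


def hamming(r1, r2):
    """Number of attributes on which two log records differ (assumed distance)."""
    return sum(a != b for a, b in zip(r1, r2))


def total_cost(records, medoids):
    """Sum of distances from every record to its nearest center point."""
    return sum(min(hamming(r, records[m]) for m in medoids) for r in records)


def pam(records, k, seed=0):
    """Return (center point indices, cluster label per record) for k clusters."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(records)), k)  # step 2a: random initial centers
    best = total_cost(records, medoids)
    improved = True
    while improved:  # step 2c: swap a center with a non-center while cost drops
        improved = False
        for pos, cand in product(range(k), range(len(records))):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(records, trial)
            if cost < best:
                medoids, best, improved = trial, cost, True
    labels = [
        min(range(k), key=lambda i: hamming(r, records[medoids[i]]))
        for r in records
    ]
    return medoids, labels


# Hypothetical authorized log records: (role, resource type, operation).
records = [
    ("dev", "code", "read"), ("dev", "code", "write"), ("dev", "code", "read"),
    ("hr", "payroll", "read"), ("hr", "payroll", "write"), ("hr", "payroll", "read"),
]
medoids, labels = pam(records, k=2)
print(medoids, labels)
```

For step 2d, `pam` could be called for several values of k and the resulting policies compared on accuracy and complexity; that selection loop is omitted from this sketch.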
4. The method for extracting and optimizing ABAC policy based on access log according to claim 1, wherein in the third step, the attribute condition in each rule, that is, the combination of different attribute-value pairs is extracted; the method comprises the following specific steps:
step 3a: defining valid positive or negative attribute-value pairs; an attribute-value pair <a, v | !v> is said to be valid for the cluster c_i corresponding to rule ρ if and only if, for a given threshold T_p or T_n, the frequency with which the attribute value v appears in c_i is higher or lower than the frequency with which it appears in the original log set; a pair <a, v | !v> that is added to the attribute conditions of the authorization rule ρ = <AC, op> is called an effective attribute condition and is denoted EAC_ρ; the set of all rules is denoted P;
step 3b: according to the definition of effective attribute conditions, the extraction process of the effective attribute conditions for any given cluster c_i is given in Algorithm 1;
Algorithm 1. Effective attribute condition extraction:
Input: cluster c_i, access log set AL, attribute set A, attribute value set V, thresholds T_p, T_n
Determine E_{c_i}, the set of entities (subjects, objects, etc.) contained in cluster c_i;
Determine E_AL, the set of entities (subjects, objects, etc.) contained in AL;
for each a in A do
    for each v in V_a do
        if … then
            EAC_ρ ← {<a, v>};
        end if
        if … then
            EAC_ρ ← {<a, !v>};
        end if
    end for
end for
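By way of non-limiting illustration, the loop structure of Algorithm 1 may be sketched in Python as follows. The two test conditions are an assumption: this sketch compares relative frequencies by difference, adding a positive pair when the frequency in the cluster exceeds that in the whole log by at least T_p, and a negative pair when it falls short by at least T_n. Records are simplified to attribute-to-value dictionaries, and all names are hypothetical.

```python
from collections import Counter


def value_frequencies(records, attribute):
    """Relative frequency of each value of `attribute` among `records`."""
    counts = Counter(r[attribute] for r in records if attribute in r)
    total = sum(counts.values())
    return {v: n / total for v, n in counts.items()} if total else {}


def effective_conditions(cluster, access_log, attributes, t_p, t_n):
    """Return a set of (attribute, value, is_positive) triples for one cluster."""
    eac = set()
    for a in attributes:
        f_cluster = value_frequencies(cluster, a)
        f_log = value_frequencies(access_log, a)
        for v, f in f_log.items():
            diff = f_cluster.get(v, 0.0) - f
            if diff >= t_p:        # assumed test for a positive pair <a, v>
                eac.add((a, v, True))
            elif -diff >= t_n:     # assumed test for a negative pair <a, !v>
                eac.add((a, v, False))
    return eac


# Hypothetical log of four records; the cluster holds the two "dev" records.
access_log = [
    {"role": "dev", "op": "read"}, {"role": "dev", "op": "read"},
    {"role": "hr", "op": "read"}, {"role": "hr", "op": "read"},
]
cluster = access_log[:2]
eac = effective_conditions(cluster, access_log, ["role", "op"], t_p=0.4, t_n=0.4)
print(sorted(eac))
```

The attribute `op` yields no condition here because its value is equally frequent in the cluster and in the whole log.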
5. The ABAC policy extraction and optimization method based on access logs according to claim 1, wherein in the fourth step, the specific steps are as follows:
step 4a: the original access log AL = {<rq, d>} is divided into a positive log and a negative log:
AL⁺ = {<rq, d> | <rq, d> ∈ AL ∧ d = permitted};
AL⁻ = {<rq, d> | <rq, d> ∈ AL ∧ d = denied};
AL = AL⁺ ∪ AL⁻,
where <rq, d> represents an authorization (or access) record describing the authorization decision d made by the system for the access request rq, whose value is either "permitted" (access granted) or "denied" (access denied);
step 4b: according to the original logs AL⁺ and AL⁻ and the initially extracted policy Π_m, the "true positive" (TP), "false positive" (FP), "true negative" (TN) and "false negative" (FN) records are determined respectively, where:
TP_{Π|AL} = {<rq, d> | <rq, d> ∈ AL⁺ : d_Π(rq) = permitted}, meaning that for an access request rq in the positive log AL⁺, the authorization decision d_Π(rq) made by Π also permits access;
FP_{Π|AL} = {<rq, d> | <rq, d> ∈ AL⁻ : d_Π(rq) = permitted}, meaning that for an access request rq in the negative log AL⁻, the authorization decision d_Π(rq) made by Π permits access;
TN_{Π|AL} = {<rq, d> | <rq, d> ∈ AL⁻ : d_Π(rq) = denied}, meaning that for an access request rq in the negative log AL⁻, the authorization decision d_Π(rq) made by Π also denies access;
FN_{Π|AL} = {<rq, d> | <rq, d> ∈ AL⁺ : d_Π(rq) = denied}, meaning that for an access request rq in the positive log AL⁺, the authorization decision d_Π(rq) made by Π denies access;
step 4c: taking the FN and FP log records as training data sets, and extracting the policy patterns Π_FN and Π_FP respectively;
step 4d: comparing Π_FN and Π_FP with Π_m, and eliminating redundant attribute conditions from strict rules or adding missing attribute conditions to relaxed rules; in each optimization pass, the rules in Π_m that are similar to rules in Π_FN or Π_FP are selected and processed in the following two ways:
for any rule ρ_i of Π_FN, if Π_m contains a rule ρ_j similar to ρ_i, the redundant attribute conditions are deleted from ρ_j; if Π_m contains no rule similar to ρ_i, that is, ρ_i is a missing rule, ρ_i is added directly to Π_m;
for any rule ρ_i of Π_FP, if Π_m contains a rule ρ_j similar to ρ_i, the missing attribute conditions are added to ρ_j.
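By way of non-limiting illustration, the division of the log in step 4a and the determination of the TP, FP, TN and FN records in step 4b may be sketched in Python as follows. Representing a policy as a list of rules, each a dictionary of positive attribute conditions, and permitting a request when any rule matches, are simplifying assumptions; the names `decide` and `partition_log` are hypothetical.

```python
def decide(policy, request):
    """Permit if all attribute conditions of some rule hold for the request."""
    for rule in policy:
        if all(request.get(a) == v for a, v in rule.items()):
            return "permitted"
    return "denied"


def partition_log(policy, access_log):
    """Split <rq, d> records into TP, FP, TN and FN with respect to `policy`."""
    buckets = {"TP": [], "FP": [], "TN": [], "FN": []}
    for request, decision in access_log:
        predicted = decide(policy, request)
        if decision == "permitted":      # record belongs to the positive log
            key = "TP" if predicted == "permitted" else "FN"
        else:                            # record belongs to the negative log
            key = "FP" if predicted == "permitted" else "TN"
        buckets[key].append((request, decision))
    return buckets


# Hypothetical extracted policy and original log.
policy_m = [{"role": "dev"}]
access_log = [
    ({"role": "dev", "op": "read"}, "permitted"),
    ({"role": "dev", "op": "write"}, "denied"),
    ({"role": "hr", "op": "read"}, "denied"),
    ({"role": "hr", "op": "view"}, "permitted"),
]
buckets = partition_log(policy_m, access_log)
print({name: len(rows) for name, rows in buckets.items()})
```

The FN and FP buckets produced this way are the records that step 4c would use as training data.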
6. The ABAC strategy extraction and optimization method based on the access log according to claim 1, wherein in the fourth step, the optimization process is as shown in Algorithm 2:
Algorithm 2. Policy optimization
Input: initial policy Π_m, policy patterns Π_FN and Π_FP, attribute condition set EAC
Output: optimized policy Π_m'
Initialize Π_m' = Π_m;
for each ρ_i in Π_FN.P do
    for each ρ_j in Π_m'.P do
        if ρ_j is similar to ρ_i then
            …
        else
            …
        end if
    end for
end for
for each ρ_i in Π_FP.P do
    for each ρ_j in Π_m'.P do
        if ρ_j is similar to ρ_i then
            …
        end if
    end for
end for.
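By way of non-limiting illustration, one possible reading of Algorithm 2 may be sketched in Python as follows. Several points are assumptions of this sketch and are not stated in the claims: rules are modelled as sets of (attribute, value) conditions, similarity is taken to be a Jaccard index at or above a hypothetical threshold, "redundant" conditions are those of ρ_j that do not occur in ρ_i, and "missing" conditions are those of ρ_i that do not occur in ρ_j.

```python
def jaccard(r1, r2):
    """Jaccard index of two rules viewed as sets of attribute conditions."""
    union = r1 | r2
    return len(r1 & r2) / len(union) if union else 1.0


def optimize(policy_m, policy_fn, policy_fp, threshold=0.5):
    """Refine policy_m using the patterns mined from FN and FP records."""
    refined = [frozenset(r) for r in policy_m]
    for rho_i in map(frozenset, policy_fn):
        similar = [j for j, rho_j in enumerate(refined)
                   if jaccard(rho_i, rho_j) >= threshold]
        if similar:
            for j in similar:
                refined[j] = refined[j] & rho_i  # drop conditions absent from rho_i
        else:
            refined.append(rho_i)                # missing rule: add it directly
    for rho_i in map(frozenset, policy_fp):
        for j, rho_j in enumerate(refined):
            if jaccard(rho_i, rho_j) >= threshold:
                refined[j] = rho_j | rho_i       # add the missing conditions
    return refined


# Hypothetical initial policy and mined patterns.
policy_m = [
    {("role", "dev"), ("op", "read"), ("time", "working_hours")},
    {("role", "hr"), ("op", "read")},
]
policy_fn = [{("role", "dev"), ("op", "read")}, {("role", "admin"), ("op", "write")}]
policy_fp = [{("role", "hr"), ("op", "read"), ("dept", "payroll")}]
refined = optimize(policy_m, policy_fn, policy_fp)
for rule in refined:
    print(sorted(rule))
```

Unlike the nested loops of Algorithm 2, this sketch adds a missing rule once, after all rules of the current policy have been checked, so that the rule is not appended repeatedly.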
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211153020.3A CN116049842A (en) | 2022-09-21 | 2022-09-21 | Access log-based ABAC strategy extraction and optimization method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116049842A true CN116049842A (en) | 2023-05-02 |
Family
ID=86114197
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211153020.3A Pending CN116049842A (en) | 2022-09-21 | 2022-09-21 | Access log-based ABAC strategy extraction and optimization method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116049842A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117436063A (en) * | 2023-09-22 | 2024-01-23 | 中国人民解放军战略支援部队信息工程大学 | Hierarchical clustering and relation extraction-based ABAC strategy generation method and system |
CN117436063B (en) * | 2023-09-22 | 2024-05-31 | 中国人民解放军战略支援部队信息工程大学 | Hierarchical clustering and relation extraction-based ABAC strategy generation method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||