NL2028096B1

NL2028096B1 - Method for excavating tiny non-reduction association rule based on item subset case tree

Info

Publication number: NL2028096B1
Application number: NL2028096A
Authority: NL
Inventors: Pei Zheng; Li Bo; Zhou Bin; Kong Mingming
Original assignee: Univ Xihua
Priority date: 2021-04-29
Filing date: 2021-04-29
Publication date: 2021-06-29

Abstract

The invention discloses a method for excavating a tiny non-reduction association rule based on an item subset case tree, comprising the following steps: utilizing the closed item sets generated by individual items, together with the union operation on sets, to generate item subsets in a case item database, wherein the set of generated item subsets is a proper subset of the power set of the item set; utilizing the generated item subsets to construct an item subset case tree structure of the case item database; excavating the closed frequent item sets and the tiny generators thereof in the item subset case tree; and rapidly generating the tiny non-reduction association rules according to the excavated closed frequent item sets and the tiny generators thereof. The invention utilizes the closed item sets generated by the individual items to obtain a plurality of item subsets, and constructs the item subset case tree to depict the hierarchical relationship of the item subsets and their corresponding support degrees, thereby reducing the number of searches between cases and items, reducing storage space, and improving the speed and efficiency of excavating the tiny non-reduction association rules. (Fig. 8)

Description

1 AO 21.04.1094 NL Method for excavating tiny non-reduction association rule based on item subset case tree

TECHNICAL FIELD OF THE INVENTION

The present invention relates to the field of data excavation and knowledge acquisition, and proposes a method for excavating tiny non-reduction association rules, based on an item subset case tree, from a large case item database, thereby obtaining a non-redundant knowledge library of the large case item database.

TECHNICAL BACKGROUND OF THE INVENTION

In a large case item database, an association rule describes a simultaneous occurrence relationship between items, that is, a plurality of cases in the database meet certain items at the same time, wherein a part of the items is taken as the former piece (antecedent) while the remaining items are taken as the latter piece (consequent) to constitute an association rule between items. For example, in a large supermarket transaction database, each transaction is taken as one case and each commodity involved in a transaction is taken as one item; the excavated association rules depict which commodities occur together in transactions, and this knowledge can be used for commodity placement, purchase quantity management and other supermarket commodity management. Theoretically, if the set of cases meeting an item subset is not empty, that item subset can be used for excavating association rules. Therefore, on the one hand, association rule excavation is carried out in the power set of the item set, which is an NP-hard problem in computer science. On the other hand, because association rules describe reasonable, scientific and useful knowledge in the large case item database, association rule excavation has been widely used in computer science, management science, economics, social science and other fields to obtain such knowledge from the corresponding database. The association rules that are usually excavated are so numerous that they exceed what people can understand; therefore, in combination with practical applications, a variety of extended or improved methods for excavating association rules have been proposed. In general, these methods comprise the following two main steps:

1. generating a frequent item set or a closed frequent item set.

2. excavating various association rules from the frequent item sets or the closed frequent item sets.

In practical applications, on the one hand, a great many frequent item sets or closed frequent item sets are generated; therefore, huge frequent item sets, generalized item sets, free item sets, disjunction-free item sets and so on have also been proposed to restrict the number of item sets used for generating association rules, or to obtain association rules for a special need. On the other hand, the association rules excavated from the frequent item sets or the closed frequent item sets contain redundant information; therefore, tiny-huge association rules, irreducible association rules, tiny non-reduction association rules, weighted association rules and so on have also been proposed to restrict the form of the association rules and reduce the generation of redundant association rules.

In terms of how association rules are generated, the methods in the prior art can be divided into two categories. The first category consists of methods derived from the Apriori method, which was the first method proposed for excavating association rules. Its core idea is to construct the Apriori generation function and to generate item subsets by successively adding items according to the support degree of each item; the generated item subsets are stored in a hash-tree structure, through which associated item subsets are quickly found and taken as the former piece and the latter piece, thereby quickly generating the association rules. Subsequently, the Apriori method was extensively extended or improved. The second category consists of methods derived from the frequent-pattern (FP) tree. Unlike the hash-tree structure of the Apriori method, the FP-tree is a compact representation of the relevant frequent item subsets: each branch of the FP-tree stores a frequent item subset in descending order of support degree. To construct the FP-tree, the items are first arranged in descending order of support degree, and then the case set and the item set are traversed, so that the frequent item subsets are constructed layer by layer in descending order of support degree; the FP-tree can then be used to quickly generate the association rules. Subsequently, the FP-tree method was extensively extended or improved.

It can be seen that the common feature of these methods for excavating association rules is that frequent item subsets are generated by successively adding single items; during the generation process, the items are sequenced in descending order of support degree, so the frequent item subsets are also generated in descending order of support degree. The hash-tree storage structure grows starting from single items, and the case set and the item set need to be traversed many times to generate the frequent item subsets; in a large case item database, the amount of calculation and the storage space increase exponentially. In the FP-tree storage structure, the items are arranged in a list in descending order of support degree, and after the case set and the item set are traversed twice, the branches of the FP-tree, in which the frequent item subsets are arranged in descending order of support degree, can be constructed. Since the frequent item subsets are still generated by successively adding single items, the methods derived from the FP-tree still face the problems of calculation amount and storage space when generating the frequent item subsets and the corresponding association rules in a large case item database. In general, generating frequent item subsets based on the support degree of single items and by adding items successively has the following defects:

1. the successive addition of items essentially means that single items are traversed and searched in the item set, causing the number of generated frequent item subsets to be very large; in particular, the number of frequent item subsets in a large case item database grows exponentially, which is not conducive to rapidly excavating the tiny-huge association rule, the tiny non-reduction association rule and so on. In fact, correlations exist between the items in a large case item database: the occurrence of one item can inevitably lead to the occurrence of another item. Adding items successively does not utilize such correlations.

2. the successive addition of items requires a large amount of calculation during the generation of the frequent item subsets and generates many redundant frequent item subsets, which expands the scope that must be searched for the closed frequent item sets and the generators of item subsets; this results in both a calculation problem and a storage problem and is not conducive to the rapid excavation of association rules. In fact, the relationships between the items in the large case item database can be used to effectively reduce the number of redundant frequent item subsets that are generated.

SUMMARY OF THE PRESENT INVENTION

In order to overcome the shortcomings of the successive addition of items in the excavation of association rules, the invention utilizes the correlation between items in a large case item database to generate frequent item subsets, provides a method for constructing an item subset case tree, and provides a method for quickly excavating closed frequent item sets, tiny generators and tiny non-reduction association rules in the item subset case tree.

To realize the above purpose, the present invention adopts the following technical solution. A method for excavating a tiny non-reduction association rule based on an item subset case tree comprises the following steps:

generating a closed item set corresponding to each item according to a closed operation between cases and items in a case item database, wherein the closed item set has the same support degree as the corresponding item;

sequencing the generated closed item sets in descending order of the number of elements in each set, and generating each item subset via the union operation on sets;

generating, via the intersection operation on sets, the case set met by each item subset (whose size is the support degree of the item subset), and constructing an item subset case tree structure in the order of generation;

excavating the closed frequent item sets and the tiny generators thereof in the item subset case tree, and further generating the tiny non-reduction association rules.

Specifically, let a case item database be $D=(U,A)$, where $U=\{u_1,u_2,\ldots,u_n\}$ is the case set and $A=\{a_1,a_2,\ldots,a_m\}$ is the item set. Each case $u_i$ ($i=1,2,\ldots,n$) is an item subset; for example, $u_1=\{a_1,a_2,a_3\}$ is one subset of $A$, which indicates that $u_1$ meets items $a_1$, $a_2$ and $a_3$. The invention uses the following two mappings, $\tau$ and $\gamma$, to describe the two operations between cases and items. For any $a_j \in A$, $j=1,2,\ldots,m$,

$$\tau(a_j)=\{u_i \mid u_i \in U \text{ and } a_j \in u_i\}.$$

Intuitively, $\tau(a_j)$ is the case subset consisting of all cases that meet item $a_j$; therefore, in the case item database, the support degree of item $a_j$ is the number of elements of $\tau(a_j)$, that is, $\mathrm{sup}(a_j)=|\tau(a_j)|$. Naturally, for any item subset $A_k \subseteq A$,

$$\tau(A_k)=\{u_i \mid u_i \in U \text{ and } A_k \subseteq u_i\}.$$

Intuitively, $\tau(A_k)$ is the case subset consisting of the cases that meet every item in $A_k$ at the same time; therefore, the support degree of the item subset $A_k$ is the number of elements of $\tau(A_k)$, that is, $|\tau(A_k)|$.

For any case subset $U_k \subseteq U$, the item subset met by $U_k$ is as follows:

$$\gamma(U_k)=\bigcap_{u_i \in U_k} u_i.$$

Based on the above mappings, the method for excavating the tiny non-reduction association rule based on the item subset case tree of the present invention is specifically described in the following:

1. Generating a closed item set corresponding to each item

For any item $a_j \in A$, the two mappings $\tau$ (from an item subset to the case set meeting it) and $\gamma$ (from a case set to the item subset it meets) are used, and the closed item set generated by item $a_j$ is as follows:

$$C(a_j)=\gamma(\tau(a_j))=\bigcap_{u_i \in \tau(a_j)} u_i.$$

According to the definitions of the mappings $\tau$ and $\gamma$, $\tau(a_j)$ is the set of all cases meeting item $a_j$, while the item subset met by $\tau(a_j)$ is $\gamma(\tau(a_j))$; therefore, the case subset met by the item subset $C(a_j)$ and the case subset met by item $a_j$ are the same, that is, the support degree of $C(a_j)$ equals the support degree of item $a_j$. Many properties of the mappings $\tau$ and $\gamma$ are already known, and from these properties $C(a_j)$ is easily proved to be a closed item set. Formally, the closed item set $C(a_j)$ expresses a correlation with item $a_j$: every case that meets item $a_j$ also meets every item in $C(a_j)$, so that if item $a_j$ is present, the other items in $C(a_j)$ are inevitably present as well.
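The closed operation of step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `CASES` is a hypothetical stand-in for a case item database (its rows were chosen to agree with the case sets quoted in Embodiment 1), and the function names `tau`, `gamma` and `closed_item_set` are illustrative names, not part of the original text.

```python
# Hypothetical case item database: case number -> items met by that case.
# Assumed data, chosen to agree with the case sets quoted in Embodiment 1.
CASES = {
    1: frozenset({"a1", "a5"}),
    2: frozenset({"a1", "a3"}),
    3: frozenset({"a3", "a5"}),
    4: frozenset({"a2", "a3"}),
    5: frozenset({"a1", "a2", "a3", "a4", "a5"}),
    6: frozenset({"a3", "a4", "a5"}),
}
ITEMS = ["a1", "a2", "a3", "a4", "a5"]


def tau(items):
    """Case set meeting every item in `items` (mapping tau)."""
    items = frozenset(items)
    return frozenset(u for u, row in CASES.items() if items <= row)


def gamma(cases):
    """Item subset met by every case in `cases` (mapping gamma)."""
    if not cases:
        # Convention assumed here: an empty case set meets all items.
        return frozenset(ITEMS)
    return frozenset.intersection(*(CASES[u] for u in cases))


def closed_item_set(item):
    """C(a_j) = gamma(tau(a_j)), the closed item set generated by one item."""
    return gamma(tau({item}))


if __name__ == "__main__":
    for a in ITEMS:
        print(a, sorted(closed_item_set(a)), "support =", len(tau({a})))
```

Because `closed_item_set` only intersects the rows of the cases that already meet the item, the support degree of the result is the same as that of the item itself, which is the property stated in step 1.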

2. Building the item subset case tree

Different from generating the frequent item sets by gradually adding single items, the present invention uses the closed item sets $C(a_j)$ of the single items to generate the item subsets; that is, $B=\{C(a_1),C(a_2),\ldots,C(a_m)\}$ is regarded as a generator basis, and an item subset is generated by applying the union operation on sets to a plurality of its elements. For example, the union of three elements of $B$, $C(a_i)\cup C(a_j)\cup C(a_k)$, generates one item subset. Formally, let $A'$ be one generated item subset; then, for some index set $J \subseteq \{1,2,\ldots,m\}$,

$$A'=\bigcup_{j \in J} C(a_j).$$

Many properties of the closed item sets $C(a_j)$ are already known. According to the existing properties, it is easy to prove that all closed item sets of the case item database are included among the item subsets generated by the generator basis $B=\{C(a_1),C(a_2),\ldots,C(a_m)\}$. According to this conclusion, all the item subsets can first be generated from the generator basis $B$, and the required closed frequent item sets can then be excavated among the generated item subsets.

Since each $C(a_j)$ is itself a closed item set, on the one hand, the item subsets generated by the generator basis $B$ differ from the item subsets generated by adding single items successively; on the other hand, the set of item subsets generated by the generator basis $B$ is a proper subset of the power set of the item set. The number of item subsets generated by the generator basis $B$ is smaller than the number generated by adding single items successively, which means that the range in which the closed frequent item sets are excavated is smaller.
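As a rough illustration of why the item subsets generated by B form only part of the power set, the sketch below enumerates all unions of a small basis. The basis values are assumed (they correspond to the closed item sets obtained in Embodiment 1), and `unions_of` is a hypothetical helper name.

```python
from itertools import combinations

# Assumed generator basis B, matching the closed item sets of Embodiment 1.
BASIS = [
    frozenset({"a3", "a4", "a5"}),  # C(a4)
    frozenset({"a2", "a3"}),        # C(a2)
    frozenset({"a1"}),              # C(a1)
    frozenset({"a3"}),              # C(a3)
    frozenset({"a5"}),              # C(a5)
]


def unions_of(basis):
    """All distinct item subsets obtainable as a union of basis elements."""
    result = set()
    for size in range(1, len(basis) + 1):
        for family in combinations(basis, size):
            result.add(frozenset().union(*family))
    return result


if __name__ == "__main__":
    generated = unions_of(BASIS)
    non_empty_power_set = 2 ** 5 - 1  # non-empty subsets of a 5-item set
    print(len(generated), "generated item subsets versus", non_empty_power_set)
```

For this basis, several different unions collapse to the same item subset (for example, adding C(a3) to C(a4) changes nothing), which is why fewer item subsets are produced than by enumerating single items.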

Formally, the case set meeting an item subset $A'=\bigcup_{j \in J} C(a_j)$ generated by the generator basis $B$ can be represented as:

$$\tau(A')=\bigcap_{j \in J} \tau(C(a_j)),$$

where $\tau$ maps an item subset to the case set meeting it. The following procedure is used to build the item subset case tree and rapidly generate all of the above item subsets together with the case sets they are met by:

(1) Each node of the item subset case tree is represented as $A' \times \tau(A')$, where $A'$ is one item subset generated by $B$ and $\tau(A')$ is the case set meeting $A'$.

(2) The root node of the item subset case tree is represented as $\emptyset \times U$.

(3) Each sub-node of the root node is represented as $C(a_j) \times \tau(C(a_j))$, where the sub-nodes are arranged from left to right in descending order of the number of items contained in $C(a_j)$; that is, the first sub-node on the left is the one whose $C(a_j)$ contains the most items and the last sub-node is the one whose $C(a_j)$ contains the fewest items, and nodes with the same number of items are arranged by serial number.

(4) The sub-nodes of each sub-node $C(a_j) \times \tau(C(a_j))$ are generated in the following way: let $C(a_1) \times \tau(C(a_1)), C(a_2) \times \tau(C(a_2)), \ldots, C(a_m) \times \tau(C(a_m))$ be the nodes arranged as required in (3). For each sub-node $C(a_j) \times \tau(C(a_j))$, its first sub-node is

$$(C(a_j) \cup C(a_{j+1})) \times (\tau(C(a_j)) \cap \tau(C(a_{j+1})))$$

if $C(a_j) \cup C(a_{j+1}) \neq A$ and $C(a_{j+1}) \not\subset C(a_j)$ and $\tau(C(a_j)) \cap \tau(C(a_{j+1})) \neq \emptyset$. The other sub-nodes are generated in the same way, respectively and successively, as $(C(a_j) \cup C(a_{j+2})) \times (\tau(C(a_j)) \cap \tau(C(a_{j+2}))), \ldots, (C(a_j) \cup C(a_m)) \times (\tau(C(a_j)) \cap \tau(C(a_m)))$.

(5) For any node $A' \times \tau(A')$, assume that $A'=A'' \cup C(a_j)$; then the first sub-node of $A' \times \tau(A')$ is

$$(A' \cup C(a_{j+1})) \times (\tau(A') \cap \tau(C(a_{j+1})))$$

if $A' \cup C(a_{j+1}) \neq A$ and $C(a_{j+1}) \not\subset A'$ and $\tau(A') \cap \tau(C(a_{j+1})) \neq \emptyset$. The other sub-nodes are generated in the same way, respectively and successively, as $(A' \cup C(a_{j+2})) \times (\tau(A') \cap \tau(C(a_{j+2}))), \ldots, (A' \cup C(a_m)) \times (\tau(A') \cap \tau(C(a_m)))$.

(6) When frequent item subsets need to be generated, it is only necessary to add the minimum support limit $\sigma$ to each node generation step, that is, for any node $A' \times \tau(A')$, to add the limit $|\tau(A')| \geq \sigma$.
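The tree-building procedure (1) to (6) can be sketched as a short recursive routine. This is a minimal sketch under several assumptions: `CASES` is a hypothetical database consistent with the case sets quoted in Embodiment 1; `Node`, `build_tree` and `walk` are illustrative names; and the sketch applies only the non-containment test and the non-empty (or minimum support) test when creating a sub-node, so that the node for the full item set, which appears in the equivalence category listed in Embodiment 1, is still produced.

```python
from dataclasses import dataclass, field

# Hypothetical case item database, consistent with Embodiment 1 (assumed data).
CASES = {
    1: frozenset({"a1", "a5"}),
    2: frozenset({"a1", "a3"}),
    3: frozenset({"a3", "a5"}),
    4: frozenset({"a2", "a3"}),
    5: frozenset({"a1", "a2", "a3", "a4", "a5"}),
    6: frozenset({"a3", "a4", "a5"}),
}
ITEMS = ["a1", "a2", "a3", "a4", "a5"]


def tau(items):
    """Case set meeting every item in `items`."""
    return frozenset(u for u, row in CASES.items() if items <= row)


def closed_item_set(item):
    """C(a_j): intersection of all cases that meet the item."""
    return frozenset.intersection(*(CASES[u] for u in tau(frozenset({item}))))


@dataclass
class Node:
    items: frozenset            # item subset A'
    cases: frozenset            # case set tau(A')
    children: list = field(default_factory=list)


def build_tree(min_support=1):
    # Rule (3): largest closed item sets first; the stable sort keeps
    # serial order among closed item sets of equal size.
    basis = sorted((closed_item_set(a) for a in ITEMS), key=lambda c: -len(c))
    root = Node(frozenset(), frozenset(CASES))

    def expand(node, start):
        for k in range(start, len(basis)):
            closed = basis[k]
            if closed <= node.items:        # C(a_k) adds nothing new: skip
                continue
            cases = node.cases & tau(closed)
            if len(cases) < min_support:    # rule (6); also drops empty sets
                continue
            child = Node(node.items | closed, cases)
            node.children.append(child)
            expand(child, k + 1)            # rules (4) and (5)

    expand(root, 0)
    return root


def walk(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


if __name__ == "__main__":
    for n in walk(build_tree()):
        print("".join(sorted(n.items)) or "(root)", sorted(n.cases))
```

Calling `build_tree(min_support=2)` prunes every node whose case set has fewer than two cases, which corresponds to the minimum support limit of rule (6).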

3. Excavating the closed frequent item set and the tiny generator thereof and generating the tiny non-reduction association rule

In the item subset case tree, each node consists of an item subset and the case set met by that item subset, written $A' \times \tau(A')$, where $\tau(A')$ is the case set meeting $A'$. According to the case sets, the following equivalence relation $\cong$ is defined on the nodes of the item subset case tree: for any two nodes $A' \times \tau(A')$ and $A'' \times \tau(A'')$,

$$A' \times \tau(A') \cong A'' \times \tau(A'') \text{ if and only if } \tau(A')=\tau(A'').$$

According to the equivalence relation $\cong$, equivalent nodes can be incorporated as:

$$[A'] \times \tau(A'),$$

where $[A']$ is the set consisting of the item subsets of all nodes equivalent to the node $A' \times \tau(A')$ in the item subset case tree; that is, the case set met by every item subset in $[A']$ is $\tau(A')$. For convenience of description, the invention adopts the following conventions:

(1) $\max[A']$ is the largest element of $[A']$ with respect to the inclusion relation.

(2) $\min[A']$ is the generator set of the largest element in $[A']$.

Based on the above conventions, the closed frequent item set and the tiny generators thereof are as follows:

① $\max[A']$ is the closed frequent item set whose support degree is $|\tau(A')|$.

② Let $A'' \in \min[A']$. If a subset of $A''$ is met by the case set $\tau(A')$ and none of its proper subsets is met by exactly the case set $\tau(A')$, that subset is one tiny generator of the closed frequent item set $\max[A']$. $G\min[A']$ denotes all of the tiny generators of $\max[A']$ obtained from $\min[A']$.

According to the closed frequent item sets and the tiny generators thereof, the tiny non-reduction association rules are generated as follows:

① The tiny non-reduction association rule with a confidence of 1. For any equivalence category $[A']$, let $A_1 \in G\min[A']$; then $A_1 \rightarrow (\max[A']-A_1)$ is a tiny non-reduction association rule. Its support degree is $\mathrm{sup}(A_1 \rightarrow (\max[A']-A_1))=|\tau(A')|$, while its confidence is $\mathrm{conf}(A_1 \rightarrow (\max[A']-A_1))=|\tau(A')|/|\tau(A_1)|=1$.

② The tiny non-reduction association rule with a confidence of $\beta$. For any equivalence category $[A']$ and its father node equivalence category $[A'']$, that is, in the item subset case tree $A''$ is the father node of $A'$ and $\tau(A') \subset \tau(A'')$, let $A_1 \in G\min[A'']$; then $A_1 \rightarrow (\max[A']-A_1)$ is a tiny non-reduction association rule. Its support degree is $\mathrm{sup}(A_1 \rightarrow (\max[A']-A_1))=|\tau(A')|$, and its confidence is $\beta=\mathrm{conf}(A_1 \rightarrow (\max[A']-A_1))=|\tau(A')|/|\tau(A_1)|<1$.

Compared with the prior art, the invention has the following beneficial effects. The present invention is a method for excavating a tiny non-reduction association rule based on an item subset case tree and utilizes the closed item sets of single items to generate the item subset case tree; compared with a method that enumerates single items to generate item subsets, the present invention generates fewer item subsets, thereby effectively avoiding the generation of redundant item subsets. Meanwhile, the search for the closed frequent item sets and the tiny generators thereof is confined to the item subset case tree, thereby effectively reducing the search scope. In addition, by utilizing the equivalence categories and the hierarchical relation in the item subset case tree, the tiny non-reduction association rules are quickly excavated, thereby effectively avoiding repeated calculation between the item set and the case set.
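Step 3 above (equivalence category, largest element, tiny generators and the two kinds of rules) can be illustrated with the following sketch. It is not the patented implementation: `CASES` is a hypothetical database consistent with Embodiment 1, the members of the equivalence category are written out by hand, and `tiny_generators` and `rule` are illustrative helper names. A tiny generator is taken here to be a subset with the category's case set whose proper subsets all have a different case set.

```python
from itertools import combinations

# Hypothetical case item database, consistent with Embodiment 1 (assumed data).
CASES = {
    1: frozenset({"a1", "a5"}),
    2: frozenset({"a1", "a3"}),
    3: frozenset({"a3", "a5"}),
    4: frozenset({"a2", "a3"}),
    5: frozenset({"a1", "a2", "a3", "a4", "a5"}),
    6: frozenset({"a3", "a4", "a5"}),
}


def fs(text):
    """Shorthand: 'a1 a2' -> frozenset({'a1', 'a2'})."""
    return frozenset(text.split())


def tau(items):
    """Case set meeting every item in `items`."""
    return frozenset(u for u, row in CASES.items() if items <= row)


def tiny_generators(members, cases):
    """Return (max[A'], Gmin[A']) for an equivalence category.

    `members` are the item subsets sharing the case set `cases`.
    """
    largest = max(members, key=len)
    found = set()
    for member in members:
        if member == largest:
            continue                      # search only inside min[A']
        for size in range(1, len(member) + 1):
            for combo in combinations(sorted(member), size):
                g = frozenset(combo)
                # Same case set, and removing any one item changes it.
                if tau(g) == cases and all(tau(g - {a}) != cases for a in g):
                    found.add(g)
    return largest, found


def rule(generator, closed):
    """Rule generator -> closed - generator with its support and confidence."""
    support = len(tau(closed))
    confidence = support / len(tau(generator))
    return generator, closed - generator, support, confidence


# Equivalence category of nodes whose case set is {5} (from Embodiment 1).
CATEGORY = [fs("a1 a2 a3"), fs("a1 a3 a5"), fs("a2 a3 a5"), fs("a1 a2 a3 a5"),
            fs("a1 a3 a4 a5"), fs("a2 a3 a4 a5"), fs("a1 a2 a3 a4 a5")]

if __name__ == "__main__":
    closed, gmin = tiny_generators(CATEGORY, frozenset({5}))
    for g in sorted(gmin, key=sorted):
        print(rule(g, closed))            # rules with confidence 1
    print(rule(fs("a2"), closed))         # generator from a father category
```

The last call uses `a2`, whose case set {4, 5} belongs to the father node of `a1a2a3`, so the resulting rule has a confidence below 1, as in case ② above.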

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the calculation of the closed item set corresponding to each item in one embodiment of the present invention;

Figure 2 shows the generation of the item subset case tree in one embodiment of the present invention;

Figure 3 is the specific item subset case tree generated in one embodiment of the present invention;

Figure 4 is the IT-tree generated by the CHARM-L algorithm in one embodiment of the present invention;

Figure 5 shows the method for excavating the tiny non-reduction association rule in one embodiment of the present invention;

Figure 6 is the running time curve of the algorithm proposed in the present invention and the Apriori algorithm;

Figure 7 is the used memory curve of the algorithm proposed in the present invention and the Apriori algorithm;

Figure 8 is a flow chart of the algorithm of the present invention;

Table 1 shows a case item database of six cases and five items;

Table 2 shows the closed item sets and the support degrees thereof;

Table 3 shows the nodes, the closed item sets and the tiny generators thereof which are incorporated in the item subset case tree shown in Figure 3;

Table 4 shows the tiny non-reduction association rules with a confidence threshold of 0.9;

Table 5 shows the running time and used memory in Embodiment 2.

EMBODIMENTS OF THE PRESENT INVENTION

The present invention is described in further detail with reference to embodiments. It should not be understood that the scope of the above subject matter of the present invention is limited to the following embodiments; techniques carried out based on the present invention are within the scope of the present invention.

Embodiment 1

Figure 1 illustrates the method for excavating a tiny non-reduction association rule based on an item subset case tree according to one embodiment of the present invention. Figure 1 aims at calculating the closed item set corresponding to each item and comprises the following steps: providing an example table of a case item database $D=(U,A)$ of six cases and five items, and providing the case set meeting each item and the item set met by that case set in Embodiment 1, for calculating the closed item set corresponding to each item.

Specifically, Table 1 describes the case item database $D=(U,A)$ of six cases and five items. Combined with Table 1, the case set meeting each item and the item set met by that case set are as follows:

$$\tau(a_j)=\{i \mid i \in U \text{ and } a_j \in i\},$$

$$\gamma(\tau(a_j))=\bigcap_{i \in \tau(a_j)} i,$$

wherein $i=1,2,\ldots,6$ and $j=1,2,3,4,5$. Accordingly, the closed item set corresponding to each item is as follows:

$$C(a_j)=\gamma(\tau(a_j)),$$

and the support degree thereof is as follows:

$$\mathrm{Sup}(C(a_j))=|\tau(a_j)|.$$

The closed item sets corresponding to the items according to Embodiment 1 are $B=\{C(a_1),C(a_2),C(a_3),C(a_4),C(a_5)\}$.

Figure 2 illustrates the method for excavating a tiny non-reduction association rule based on an item subset case tree according to one embodiment of the present invention. Figure 2 aims at generating the item subset case tree from the closed item sets generated in Figure 1 and comprises the following steps:

Generating the node of layer $L_0$, that is, the root node $L_0$: $\emptyset \times U$.

Generating the nodes of layer $L_1$, that is, the sub nodes of the root node, $L_1$: $C(a_1) \times \tau(C(a_1)), C(a_2) \times \tau(C(a_2)), \ldots, C(a_5) \times \tau(C(a_5))$, wherein $C(a_j)$ here denotes the closed item set with the $j$-th largest number of items.

Assuming that layer $L_{t-1}$ has been generated, the nodes of layer $L_t$ consist of the sub nodes of each node in layer $L_{t-1}$. If node $A_j' \times \tau(A_j')$ of layer $L_{t-1}$ is given and $A_j'=A_j'' \cup C(a_k)$ is met, its sub nodes are generated as follows:

$$(A_j' \cup C(a_{k+1})) \times (\tau(A_j') \cap \tau(C(a_{k+1}))), \ldots, (A_j' \cup C(a_5)) \times (\tau(A_j') \cap \tau(C(a_5))),$$

provided that $A_j' \cup C(a_i) \neq A$ and $C(a_i) \not\subset A_j'$ and $\tau(A_j') \cap \tau(C(a_i)) \neq \emptyset$, $i=k+1,\ldots,5$, are met.

Figure 3 illustrates the method for excavating a tiny non-reduction association rule based on an item subset case tree according to one embodiment of the present invention. Figure 3 aims at excavating the tiny non-reduction association rules from the item subset case tree generated in Figure 2 and comprises the following steps:

Utilizing the following equivalence relationship on nodes to incorporate the nodes in the item subset case tree: for any two nodes $A' \times \tau(A')$ and $A'' \times \tau(A'')$, where $\tau(A')$ denotes the case set meeting $A'$,

$$A' \times \tau(A') \cong A'' \times \tau(A'') \text{ if and only if } \tau(A')=\tau(A'').$$

Accordingly, the nodes with the same case set can be incorporated as:

$$[A'] \times \tau(A'),$$

wherein the case sets met by the item subsets in the item subset equivalence category $[A']$ are all $\tau(A')$. According to the inclusion relationship of sets, the largest element and the generator set in $[A']$ are $\max[A']$ and $\min[A']$, respectively. $\max[A']$ is the closed item set generated by $A'$, while the tiny generators of $\max[A']$ are searched for in $\min[A']$; that is, for any $A'' \in \min[A']$, if a subset of $A''$ is met by the case set $\tau(A')$ and none of its proper subsets is met by exactly the case set $\tau(A')$, that subset is one tiny generator of the closed item set $\max[A']$. $G\min[A']$ denotes all of the tiny generators of $\max[A']$ obtained from $\min[A']$. Accordingly, the tiny non-reduction association rules are generated as follows:

For any equivalence category $[A']$, let $A_1 \in G\min[A']$; then the rule is

$$A_1 \rightarrow (\max[A']-A_1).$$

Its support degree is $\mathrm{sup}(A_1 \rightarrow (\max[A']-A_1))=|\tau(A')|$, while its confidence is $\mathrm{conf}(A_1 \rightarrow (\max[A']-A_1))=|\tau(A')|/|\tau(A_1)|=1$.

For any equivalence category $[A']$ and its father node equivalence category $[A'']$, that is, in the item subset case tree $A''$ is the father node of $A'$ and $\tau(A') \subset \tau(A'')$, let $A_1 \in G\min[A'']$; then the rule is

$$A_1 \rightarrow (\max[A']-A_1).$$

Its support degree is $\mathrm{sup}(A_1 \rightarrow (\max[A']-A_1))=|\tau(A')|$, and its confidence is $\beta=\mathrm{conf}(A_1 \rightarrow (\max[A']-A_1))=|\tau(A')|/|\tau(A_1)|<1$.

Embodiment 1: One case item database is $D=(U,A)=(\{1,2,3,4,5,6\},\{a_1,a_2,a_3,a_4,a_5\})$; the example is shown in Table 1. According to Table 1 and Figure 1, the case set meeting $a_1$ is as follows:

$$\tau(a_1)=\{i \mid i \in U \text{ and } a_1 \in i\}=\{1,2,5\}.$$

The item subset met by the case set $\{1,2,5\}$ is as follows:

$$\gamma(\tau(a_1))=\bigcap_{i \in \tau(a_1)} i=\{a_1,a_5\} \cap \{a_1,a_3\} \cap \{a_1,a_2,a_3,a_4,a_5\}=\{a_1\}.$$

Accordingly, the closed item set corresponding to item $a_1$ is as follows:

$$C(a_1)=\gamma(\tau(a_1))=\{a_1\}.$$

The support degree thereof is as follows:

$$\mathrm{Sup}(C(a_1))=|\tau(a_1)|=|\{1,2,5\}|=3.$$

Similarly, the closed item sets and the support degrees corresponding to $a_2$, $a_3$, $a_4$ and $a_5$ can be obtained; the results of Embodiment 1 are shown in Table 2. As shown in Table 2, the closed item sets sequenced according to the number of included items are $C(a_4)$, $C(a_2)$, $C(a_1)$, $C(a_3)$, $C(a_5)$; therefore, the sub nodes of the root node $\emptyset \times U$ constitute layer $L_1$ and are, from left to right, as follows:

$$C(a_4) \times \{5,6\},\; C(a_2) \times \{4,5\},\; C(a_1) \times \{1,2,5\},\; C(a_3) \times \{2,3,4,5,6\},\; C(a_5) \times \{1,3,5,6\}.$$

The sub nodes of each node of layer $L_1$ constitute layer $L_2$, wherein the sub nodes of $C(a_4) \times \{5,6\}$ are as follows:

$$(C(a_4) \cup C(a_2)) \times (\{5,6\} \cap \{4,5\}),\; (C(a_4) \cup C(a_1)) \times (\{5,6\} \cap \{1,2,5\}).$$

$C(a_4) \cup C(a_3)$ and $C(a_4) \cup C(a_5)$ do not generate nodes, because $C(a_i) \not\subset A_j'$ is not met. Other sub nodes can be generated similarly.

The sub nodes of each node of layer $L_2$ constitute layer $L_3$, wherein the sub node of $(C(a_4) \cup C(a_2)) \times \{5\}$ is as follows:

$$(C(a_4) \cup C(a_2) \cup C(a_1)) \times (\{5\} \cap \{1,2,5\}).$$

$(C(a_4) \cup C(a_2) \cup C(a_3))$ and $(C(a_4) \cup C(a_2) \cup C(a_5))$ do not generate nodes, because $C(a_i) \not\subset A_j'$ is not met. Other sub nodes can be generated similarly.

Figure 3 shows the specific item subset case tree generated in Embodiment 1, wherein a3a4a5 represents the item subset {a3,a4,a5} and 56 represents the case subset {5,6}. Figure 4 is the IT-tree generated by the CHARM-L algorithm in Embodiment 1, wherein the representations of the item subsets and the case subsets are similar to those in Figure 3. Compared to Figure 4, the item subset case tree in Figure 3 has fewer layers and fewer nodes than the IT-tree; naturally, the scope in which the closed frequent item sets and the tiny generators thereof are excavated is smaller than in the IT-tree, and therefore the tiny non-reduction association rules can be rapidly generated in the item subset case tree.

According to the item subset case tree shown in Figure 3, the nodes in the item subset case tree are incorporated by equating their case sets, for example, [a1a2a3]×5, wherein

[a1a2a3]={a1a2a3, a1a3a5, a2a3a5, a1a2a3a5, a1a3a4a5, a2a3a4a5, a1a2a3a4a5},

max[a1a2a3]=a1a2a3a4a5,

min[a1a2a3]={a1a2a3, a1a3a5, a2a3a5, a1a2a3a5, a1a3a4a5, a2a3a4a5},

Gmin[a1a2a3]={a1a2, a1a4, a2a4, a2a5, a1a3a5}.

The generated tiny non-reduction association rules with the confidence of 1 are as follows:

a1a2→a3a4a5, a1a4→a2a3a5, a2a4→a1a3a5, a2a5→a1a3a4, a1a3a5→a2a4.

Table 3 shows the nodes, the closed item sets and the tiny generators thereof which are incorporated in the item subset case tree shown in Figure 3; Table 4 shows the tiny non-reduction association rules with a confidence threshold of 0.9.

Embodiment 2

Embodiment 2 uses the EXTENDED BAKERY data set, which records a total of 75,000 sales records for the purchase of 40 kinds of breads (numbered 1 to 40) and 10 kinds of drinks (numbered 41 to 50); the excavated attribute association rules reflect the relationship between the purchased breads and drinks. For the attribute association rules excavated by using the present invention, the threshold value of the support degree is set to 0.01 and the threshold value of the confidence is set to 0, and a total of 112 attribute association rules are generated. The method of the present invention is compared with the classical Apriori algorithm in terms of the number of attribute association rules (352 for Apriori), running time and used memory, wherein the number of the attribute association rules and the content of the former piece and the latter piece of each rule are completely identical; the running time and the used memory are shown in Table 5. In the comparative experiment, Embodiment 2 repeatedly doubles the original 75,000 records by copying, 7 times in total, to obtain 8 sets of data; the number, the support degree and the confidence of the obtained rules are unchanged, while the running time and the used memory change. Figure 6 is the running time curve of the algorithm proposed in the present invention and the Apriori algorithm. Figure 7 is the used memory curve of the algorithm proposed in the present invention and the Apriori algorithm. The 112 attribute association rules generated by this method are all within the attribute association rules (352) generated by the Apriori algorithm, and all of them are Min-Max rules.

The above are only embodiments of the present invention, but the scope of the present invention is not limited thereto; any person skilled in the art will be able to easily think of variations or substitutions within the technical scope of the present invention, which should be covered within the scope of the present invention.
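The observation in Embodiment 2 that doubling the data leaves the support degree and the confidence of each rule unchanged follows from both quantities being ratios of counts. The sketch below checks this on a tiny made-up transaction list; the transactions, and the helper names `support`, `confidence` and `doubled`, are assumptions for illustration and are not taken from the EXTENDED BAKERY data set.

```python
from fractions import Fraction

# Made-up transactions (item numbers only loosely mimic the bakery numbering).
BASE = [
    frozenset({1, 41}),
    frozenset({1, 41}),
    frozenset({2, 41}),
    frozenset({1}),
    frozenset({3, 42}),
]


def support(transactions, items):
    """Relative support: share of transactions containing all of `items`."""
    hits = sum(1 for t in transactions if items <= t)
    return Fraction(hits, len(transactions))


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    both = support(transactions, antecedent | consequent)
    return both / support(transactions, antecedent)


def doubled(transactions, times):
    """Copy-and-double the data `times` times, as in the comparative test."""
    for _ in range(times):
        transactions = transactions + transactions
    return transactions


if __name__ == "__main__":
    for k in range(8):  # original data plus 7 doublings gives 8 data sets
        data = doubled(BASE, k)
        print(len(data),
              support(data, frozenset({1, 41})),
              confidence(data, frozenset({1}), frozenset({41})))
```

Only the size of the data grows from one data set to the next, which is consistent with the running time and used memory being the quantities that change in Table 5.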

Claims


1. A method for excavating a minuscule non-reduction association rule based on an item subset case tree, which includes the following steps: step 1: generating a closed item set corresponding to each item according to a closed operation between a case and an item in a case item database, wherein the closed item set satisfies the requirement that its support level is the same as that of the corresponding item; step 2: sequencing the generated closed item sets from largest to smallest according to the number of elements in each set, and generating each item subset through a union operation of the sets; step 3: generating, through the intersection operation of the sets, the case set by which each item subset is satisfied, and constructing an item subset case tree based on the generated case sets according to the generated sequence; step 4: excavating a closed frequent item set and its minuscule generator in the item subset case tree, and using the excavated closed frequent item set and its minuscule generator to generate the minuscule non-reduction association rule.

2. Method for excavating the minuscule non-reduction association rule based on the item subset case tree, characterized in that step 1 includes the following steps: step 1.1: forming, by the cases matching an item and the items matching a case, the closed operation between a pair of the case and the item; step 1.2: using the closed operation to generate the item subset commonly met by the cases that match an item, that is, the closed item set determined by the cases that match an item.

3. Method for excavating the minuscule non-reduction association rule based on the item subset case tree, characterized in that step 2 includes the following steps: step 2.1: ranking the closed item sets determined by the cases met by each item, from largest to smallest based on the number of items included in each; step 2.2: generating a new item subset from an already generated item subset and the selected closed item set via the union of the sets, following the ranked order.

4. Method for excavating the minuscule non-reduction association rule based on the item subset case tree, characterized in that step 3 comprises: calculating, through an intersection operation of the sets, the case set by which each item subset is satisfied, and constructing the item subset case tree structure based on the generated order of the case sets.

5. Method for excavating the minuscule non-reduction association rule based on the item subset case tree, characterized in that step 4 comprises the following steps: step 4.1: selecting the item subsets with the same case set in the item subset case tree; step 4.2: taking the largest element, according to the inclusion relationship, among the item subsets with the same case set as the closed item set, where the generators are used to obtain the minuscule generator of the closed item set; step 4.3: taking the minuscule generator as a previous piece, while the closed item set minus the minuscule generator becomes a last piece, to generate the minuscule non-reduction association rule.