NL1042116A - Association rule mining method based on vector operations - Google Patents

Publication number
NL1042116A
Authority
NL
Netherlands
Prior art keywords
vector
attribute
vectors
association rule
follows
Application number
NL1042116A
Other languages
Dutch (nl)
Other versions
NL1042116B1 (en)
Inventor
Zhou Bin
Pei Zheng
Li Bo
Original Assignee
Univ Xihua

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G06N5/025 Extracting rules from data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases

Abstract

The present invention discloses an association rule mining method based on vector operations, comprising the following steps: defining vector representations of objects and attributes, and stipulating operation rules of object vectors and attribute vectors, for calculating vector bases on an attribute set; carrying out calculation based on the vector bases to generate vectors on the attribute set; calculating the support degree of any vector on the attribute set based on the vectors on the attribute set; setting a support degree threshold of the vector bases, and selecting the vectors that satisfy the support degree threshold condition; and, based on a preset confidence threshold, mining the attribute association rules meeting the condition from the vectors that satisfy the support degree threshold condition. The method generates the topology of vectors on the attribute set using the vector bases, which avoids generating the power set of a frequent closed item set, thereby avoiding operations on the power set of the attribute set and the repeated generation of attribute association rules, and improving the calculation efficiency.

Description

TITLE: Association rule mining method based on vector operations
FIELD OF THE INVENTION
The present invention relates to the field of data mining, in particular to an association rule mining method based on vector operations.
BACKGROUND OF THE INVENTION
Association rule mining aims at mining association rules of attributes, determined by quantitative relations, from a large database. A typical example of an association rule is "90% of the consumers who purchase bread and butter also purchase milk", wherein "bread and butter" is the antecedent of the association rule, "milk" is the consequent, and 90% is the confidence of the association rule. Attribute association rules reflect the useful knowledge contained in big data and have been widely applied in computer science, management science, economics, social science and other fields. When support degree and confidence are used as target functions, attribute association rule mining can be transformed into an optimization problem, and the mined attribute association rules are the optimal solutions satisfying the target functions.
At present, there are many attribute association rule mining methods based on optimization models. In these methods, various optimization methods or intelligent optimization algorithms, such as Shafer evidence theory, a directed graph method, a principal component analysis method, an evolutionary computation algorithm, a particle swarm optimization algorithm and a genetic algorithm, are used for mining the corresponding attribute association rules from an attribute subset. In existing attribute association rule mining, minimal generating elements of frequent closed item sets are used for generating a kind of Min-Max association rule, i.e. if $A'$ is a frequent closed item set and $B$ is one of the minimal generating elements of $A'$, then $B \to (A' - B)$ is a Min-Max association rule.
Analysis shows that existing attribute association rule mining generally mines the attribute association rules meeting the conditions from power sets of attribute sets or power sets of frequent closed item sets. In the mining process, related operations between objects and attributes are often repeated, and relatively complicated power set operations are involved, which causes a large number of closure operator operations on object sets, so the operation efficiency is low.
SUMMARY OF THE INVENTION
To solve the above problem, the present invention aims at overcoming the above shortcomings of the prior art and providing a mining method for simply and rapidly obtaining attribute association rules.
To achieve the above purpose, the technical solution adopted by the present invention is as follows:
An association rule mining method based on vector operations comprises the following steps: defining vector representations of objects and attributes, and stipulating operation rules of object vectors and attribute vectors, for calculating vector bases on an attribute set; carrying out calculation based on the vector bases to generate vectors on the attribute set; calculating the support degree of any vector on the attribute set based on the vectors on the attribute set; setting a support degree threshold of the vector bases, and selecting the vectors that satisfy the support degree threshold condition; and, based on a preset confidence threshold, mining the attribute association rules meeting the condition from the vectors that satisfy the support degree threshold condition.
Further, the step of defining vector representations of objects and attributes and stipulating operation rules of object vectors and attribute vectors includes: defining an information system $I$, represented as $I = (U, A, f)$, $U$ representing an object set and $A$ representing an attribute set, wherein $U = \{u_1, \dots, u_n\}$, $A = \{a_1, \dots, a_m\}$, $u_n$ representing the n-th element in the object set, and $a_m$ representing the m-th element in the attribute set; $f$ is referred to as the information function of $I$, that is, $f: U \times A \to \{0, 1\}$; for any $(u_i, a_j) \in U \times A$, if $f(u_i, a_j) = p_{ij} = 0$, the i-th object $u_i$ does not have the j-th attribute $a_j$, and if $f(u_i, a_j) = p_{ij} = 1$, the i-th object $u_i$ has the j-th attribute $a_j$; defining $A_1 \to A_2$ as an attribute association rule, wherein $A_1, A_2 \subseteq A$ and $A_1 \cap A_2 = \emptyset$, $A_1$ is referred to as the antecedent, and $A_2$ is referred to as the consequent; defining $u_i = (p_{i1}, \dots, p_{im})_{1 \times m}$, representing that the object $u_i$ may be represented as an m-dimensional row vector formed by 0 or 1; defining $a_j = (p_{1j}, \dots, p_{nj})^{T}_{n \times 1}$, representing that the attribute $a_j$ may be represented as an n-dimensional column vector formed by 0 or 1; stipulating the following vector operation rules: $1 \circ u_i = u_i$, $0 \circ u_i = \mathbf{1}_{1 \times m} = (1, \dots, 1)_{1 \times m}$, $1 \circ a_j = a_j$, $0 \circ a_j = \mathbf{1}_{n \times 1} = (1, \dots, 1)^{T}_{n \times 1}$, wherein $(1, \dots, 1)_{1 \times m}$ represents an m-dimensional row vector whose elements are all 1, and $(1, \dots, 1)^{T}_{n \times 1}$ represents an n-dimensional column vector whose elements are all 1; stipulating the vector operation rule between the attribute $a_j$ and $(u_1, \dots, u_n)$ as follows: $a_j \otimes (u_1, \dots, u_n) = (p_{1j} \circ u_1) \wedge \cdots \wedge (p_{nj} \circ u_n)$; stipulating the vector operation rule between the object $u_i$ and $(a_1, \dots, a_m)$ as follows: $u_i \otimes (a_1, \dots, a_m) = (p_{i1} \circ a_1) \wedge \cdots \wedge (p_{im} \circ a_m)$, wherein n, m, i and j are all positive integers.
Further, said calculating vector bases on an attribute set is as follows: defining $B(a_j)$ as the vector base generated by the attribute $a_j$, $B(a_j) = a_j \otimes (u_1, \dots, u_n) = (p_{1j} \circ u_1) \wedge \cdots \wedge (p_{nj} \circ u_n)$; the obtained vector bases on the attribute set are as follows: $B(A) = \{B(a_j) \mid a_j \in A\}$, wherein n and j are both positive integers.
Further, said carrying out calculation based on the vector bases to generate vectors on the attribute set is as follows: the vector $T(J')$ generated by the vector bases corresponding to $J'$ is represented as $T(J') = \bigvee_{j \in J'} B(a_j)$, wherein $J'$ is an index set; all vectors generated by the vector bases are denoted as $T(A) = \{T(J') \mid J' \subseteq \{1, 2, \dots, m\}\}$, wherein m and j are both positive integers.
Further, said calculating the support degree of any vector on the attribute set based on the vectors on the attribute set is as follows: the support degree of any vector $T(J') \in T(A)$ is
$S(T(J')) = (p'_{1j} + p'_{2j} + \cdots + p'_{nj})/n$, wherein n and j are both positive integers.
Further, the step of mining, based on the preset confidence threshold, the attribute association rules meeting the condition from the vectors that satisfy the support degree threshold condition includes: based on a preset confidence threshold of association rules, mining the attribute association rules in $T(A)$ whose confidence is greater than the confidence threshold.
Further, said mining of the attribute association rules whose confidence is greater than the confidence threshold is as follows: selecting two vectors in $T(A)$, denoted as $T(A_1)$ and $T(A_2)$, wherein $T(A_1)$ represents the vector on the attribute set determined by the vector bases corresponding to all elements of an attribute subset $A_1$, and $T(A_2)$ represents the vector on the attribute set determined by the vector bases corresponding to all elements of an attribute subset $A_2$. Either one of $T(A_1)$ and $T(A_2)$ is taken as the antecedent, and the other vector, from which the antecedent is subtracted, is taken as the consequent, so that an attribute association rule is generated, i.e., $T(A_1) \to (T(A_2) - T(A_1))$ or $T(A_2) \to (T(A_1) - T(A_2))$.
Thus, the confidence of the generated attribute association rules is as follows: $C(T(A_1) \to (T(A_2) - T(A_1))) = S(T(A_1 \cup A_2))/S(T(A_1))$ or $C(T(A_2) \to (T(A_1) - T(A_2))) = S(T(A_1 \cup A_2))/S(T(A_2))$.
Compared with the prior art, the present invention has the following advantages. Using the vector representations of the objects and attributes and the stipulated vector operations, the method generates the vector bases on the attribute set, which characterize the most basic correlations of the attributes, and uses the vector bases to generate the vectors on the attribute set. This avoids power set operations on the attribute set and reduces the number of operations between the objects and attributes. The method generates the attribute association rules whose support degree and confidence are greater than the set thresholds, which avoids generating the power sets of frequent closed item sets and the repeated generation of attribute association rules, and improves the calculation efficiency.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 illustrates an association rule mining method based on vector operations in one embodiment of the present invention.
Figure 2 is a chart comparing the run time of an algorithm in one embodiment of the present invention with that of the prior-art Apriori algorithm on the same data.
DETAILED DESCRIPTION OF THE EMBODIMENTS
The present invention is further described below in detail in combination with specific embodiments. It should be understood, however, that the scope of the above subject matter of the present invention is not limited to the following embodiments; any technology achieved based on the contents of the present invention is within the scope of the present invention.
Figure 1 illustrates an association rule mining method based on vector operations in one embodiment of the present invention, comprising the following steps:
defining vector representations of objects and attributes, and stipulating operation rules of object vectors and attribute vectors, for calculating vector bases on an attribute set; carrying out calculation based on the vector bases to generate vectors on the attribute set; calculating the support degree of any vector on the attribute set based on the vectors on the attribute set; setting a support degree threshold of the vector bases, and selecting the vectors that satisfy the support degree threshold condition; and, based on a preset confidence threshold, mining the attribute association rules meeting the condition from the vectors that satisfy the support degree threshold condition.
Specifically, the step of defining vector representations of objects and attributes and stipulating operation rules of object vectors and attribute vectors includes: defining an information system $I$, represented as $I = (U, A, f)$, $U$ representing an object set and $A$ representing an attribute set, wherein $U = \{u_1, \dots, u_n\}$, $A = \{a_1, \dots, a_m\}$, $u_n$ representing the n-th element in the object set, and $a_m$ representing the m-th element in the attribute set; $f$ is referred to as the information function of $I$, that is, $f: U \times A \to \{0, 1\}$; for any $(u_i, a_j) \in U \times A$, if $f(u_i, a_j) = p_{ij} = 0$, the i-th object $u_i$ does not have the j-th attribute $a_j$, and if $f(u_i, a_j) = p_{ij} = 1$, the i-th object $u_i$ has the j-th attribute $a_j$; defining $A_1 \to A_2$ as an attribute association rule, wherein $A_1, A_2 \subseteq A$ and $A_1 \cap A_2 = \emptyset$, $A_1$ is referred to as the antecedent, and $A_2$ is referred to as the consequent; defining $u_i = (p_{i1}, \dots, p_{im})_{1 \times m}$, representing that the object $u_i$ may be represented as an m-dimensional row vector formed by 0 or 1; defining $a_j = (p_{1j}, \dots, p_{nj})^{T}_{n \times 1}$, representing that the attribute $a_j$ may be represented as an n-dimensional column vector formed by 0 or 1; stipulating the following vector operation rules: $1 \circ u_i = u_i$, $0 \circ u_i = \mathbf{1}_{1 \times m} = (1, \dots, 1)_{1 \times m}$, $1 \circ a_j = a_j$, $0 \circ a_j = \mathbf{1}_{n \times 1} = (1, \dots, 1)^{T}_{n \times 1}$, wherein $(1, \dots, 1)_{1 \times m}$ represents an m-dimensional row vector whose elements are all 1, and $(1, \dots, 1)^{T}_{n \times 1}$ represents an n-dimensional column vector whose elements are all 1; stipulating the vector operation rule between the attribute $a_j$ and $(u_1, \dots, u_n)$ as follows: $a_j \otimes (u_1, \dots, u_n) = (p_{1j} \circ u_1) \wedge \cdots \wedge (p_{nj} \circ u_n)$; stipulating the vector operation rule between the object $u_i$ and $(a_1, \dots, a_m)$ as follows: $u_i \otimes (a_1, \dots, a_m) = (p_{i1} \circ a_1) \wedge \cdots \wedge (p_{im} \circ a_m)$, wherein n, m, i and j are all positive integers.
Specifically, said calculating vector bases on an attribute set is as follows: defining $B(a_j)$ as the vector base generated by the attribute $a_j$, $B(a_j) = a_j \otimes (u_1, \dots, u_n) = (p_{1j} \circ u_1) \wedge \cdots \wedge (p_{nj} \circ u_n)$; the obtained vector bases on the attribute set are as follows: $B(A) = \{B(a_j) \mid a_j \in A\}$, wherein n and j are both positive integers.
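The vector-base computation can be sketched in Python as follows. This is a minimal illustration only: the 4-by-3 table is hypothetical (it is not Table 1 of this document), and the names `circ`, `meet` and `vector_base` are chosen here for readability and do not come from the source.

```python
from typing import List

Vector = List[int]


def circ(p: int, u: Vector) -> Vector:
    """Stipulated rule: 1 o u = u, 0 o u = (1, ..., 1)."""
    return list(u) if p == 1 else [1] * len(u)


def meet(x: Vector, y: Vector) -> Vector:
    """Componentwise AND, written as a wedge in the formulas."""
    return [a & b for a, b in zip(x, y)]


def vector_base(table: List[Vector], j: int) -> Vector:
    """B(a_j) = (p_1j o u_1) ^ ... ^ (p_nj o u_n), with j counted from 0."""
    result = [1] * len(table[0])
    for row in table:
        result = meet(result, circ(row[j], row))
    return result


# Hypothetical table: 4 objects (rows) by 3 attributes (columns).
TOY_TABLE = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
    [1, 0, 1],
]

bases = [vector_base(TOY_TABLE, j) for j in range(3)]
print(bases)  # [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
```

In this toy table the base of the first attribute is (1, 0, 1), which reflects that every object having the first attribute also has the third one.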
Specifically, said carrying out calculation based on the vector bases to generate vectors on the attribute set is as follows: the vector $T(J')$ generated by the vector bases corresponding to $J'$ is represented as $T(J') = \bigvee_{j \in J'} B(a_j)$, wherein $J'$ is an index set; all vectors generated by the vector bases are denoted as $T(A) = \{T(J') \mid J' \subseteq \{1, 2, \dots, m\}\}$, wherein m and j are both positive integers.
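As a direct transcription of this definition, the sketch below enumerates every non-empty index set $J'$ and joins the corresponding bases componentwise. It uses the five vector bases reported in Embodiment 1; the function names are illustrative, the empty index set is skipped by assumption, and the brute-force enumeration is only meant to make the definition concrete, not to represent the efficient procedure of the embodiment.

```python
from itertools import combinations
from typing import Sequence, Set, Tuple

Vector = Tuple[int, ...]


def join(x: Vector, y: Vector) -> Vector:
    """Componentwise OR, written as a vee in the formulas."""
    return tuple(a | b for a, b in zip(x, y))


def generate_vectors(bases: Sequence[Vector]) -> Set[Vector]:
    """T(A) restricted to non-empty index sets J' (an assumption of this sketch)."""
    width = len(bases[0])
    vectors: Set[Vector] = set()
    for size in range(1, len(bases) + 1):
        for index_set in combinations(range(len(bases)), size):
            t: Vector = (0,) * width
            for j in index_set:
                t = join(t, bases[j])
            vectors.add(t)
    return vectors


# Vector bases B(a1)..B(a5) as stated in Embodiment 1.
BASES = [
    (1, 0, 0, 0, 1),
    (0, 1, 0, 0, 1),
    (0, 0, 1, 0, 0),
    (0, 0, 1, 1, 0),
    (0, 0, 0, 0, 1),
]

all_vectors = generate_vectors(BASES)
print(len(all_vectors))  # 14 distinct vectors from 31 non-empty index sets
```

Many index sets collapse to the same vector, which is why only 14 distinct vectors result here.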
Specifically, said calculating the support degree of any vector on the attribute set based on the vectors on the attribute set is as follows: the support degree of any vector $T(J') \in T(A)$ is
$S(T(J')) = (p'_{1j} + p'_{2j} + \cdots + p'_{nj})/n$, wherein n and j are both positive integers.
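The text does not define the entries $p'_{ij}$ explicitly. The sketch below therefore rests on an assumption: an object contributes 1 to the sum when it has every attribute marked 1 in the vector, and 0 otherwise. This reading agrees with $S(B(a_1)) = S(a_1)$ in Embodiment 1, but it is an interpretation, and the table and function name are hypothetical.

```python
from typing import List, Sequence


def support(table: List[Sequence[int]], vector: Sequence[int]) -> float:
    """Fraction of objects that have every attribute flagged in `vector`.

    Assumed reading of S(T(J')) = (p'_1j + ... + p'_nj) / n.
    """
    hits = sum(
        1 for row in table if all(r >= v for r, v in zip(row, vector))
    )
    return hits / len(table)


# Hypothetical table: 4 objects (rows) by 3 attributes (columns).
TOY_TABLE = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
    [1, 0, 1],
]

print(support(TOY_TABLE, [1, 0, 1]))  # 0.75
print(support(TOY_TABLE, [1, 1, 1]))  # 0.25
```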
Specifically, the step of mining, based on a preset confidence threshold, the attribute association rules meeting the condition from the vectors that satisfy the support degree threshold condition includes: based on a preset confidence threshold of association rules, mining the attribute association rules in $T(A)$ whose confidence is greater than the confidence threshold.
Specifically, said mining of the attribute association rules whose confidence is greater than the confidence threshold is as follows: selecting two vectors in $T(A)$, denoted as $T(A_1)$ and $T(A_2)$, wherein $T(A_1)$ represents the vector on the attribute set determined by the vector bases corresponding to all elements of an attribute subset $A_1$, and $T(A_2)$ represents the vector on the attribute set determined by the vector bases corresponding to all elements of an attribute subset $A_2$. Either one of $T(A_1)$ and $T(A_2)$ is taken as the antecedent, and the other vector, from which the antecedent is subtracted, is taken as the consequent, so that an attribute association rule is generated, i.e., $T(A_1) \to (T(A_2) - T(A_1))$ or $T(A_2) \to (T(A_1) - T(A_2))$.
Thus, the confidence of the generated attribute association rules is as follows: $C(T(A_1) \to (T(A_2) - T(A_1))) = S(T(A_1 \cup A_2))/S(T(A_1))$ or $C(T(A_2) \to (T(A_1) - T(A_2))) = S(T(A_1 \cup A_2))/S(T(A_2))$.
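A minimal sketch of rule generation from two vectors is given below. It assumes that $T(A_1 \cup A_2)$ can be obtained as the componentwise OR of $T(A_1)$ and $T(A_2)$, that the subtraction keeps the positions set in the second vector but not in the first, and that support counts the objects having every flagged attribute. The table and all names are hypothetical.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple


def count_supporting(table: List[Sequence[int]], vector: Sequence[int]) -> int:
    """Number of objects that have every attribute flagged in `vector`."""
    return sum(1 for row in table if all(r >= v for r, v in zip(row, vector)))


def rule(
    table: List[Sequence[int]], t1: Sequence[int], t2: Sequence[int]
) -> Tuple[List[int], List[int], Fraction]:
    """Build T(A1) -> (T(A2) - T(A1)) and its confidence S(union) / S(T(A1))."""
    union = [a | b for a, b in zip(t1, t2)]
    consequent = [b & (1 - a) for a, b in zip(t1, t2)]
    antecedent_count = count_supporting(table, t1)
    if antecedent_count == 0:
        raise ValueError("antecedent has zero support; confidence is undefined")
    # The common factor 1/n cancels, so counts can be divided directly.
    confidence = Fraction(count_supporting(table, union), antecedent_count)
    return list(t1), consequent, confidence


# Hypothetical table: 4 objects (rows) by 3 attributes (columns).
TOY_TABLE = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
    [1, 0, 1],
]

print(rule(TOY_TABLE, [0, 1, 0], [1, 0, 1]))  # confidence 1/2
print(rule(TOY_TABLE, [1, 0, 1], [0, 1, 0]))  # confidence 1/3
```

Swapping the two arguments yields the second rule of the pair, so both directions can be tested against the confidence threshold.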
Embodiment 1:
An example of an information system $I = (U, A, f) = (\{u_1, \dots, u_{10}\}, \{a_1, a_2, a_3, a_4, a_5\}, f)$ is shown in Table 1.
Table 1
In Table 1, the object vector of $u_1$ is represented as $u_1 = (1, 0, 1, 0, 1)$, i.e., the vector representation of the first row in Table 1; the vector representations of the other objects $u_i$ can be obtained similarly.
The attribute vector of $a_1$ in Table 1 is represented as $a_1 = (1, 0, 0, 1, 0, 1, 1, 1, 1, 0)^{T}$, i.e., the vector representation of the first column in Table 1; the vector representations of the other attributes $a_j$ can be obtained similarly.
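Scalar-multiplication vector operations of the object vector of $u_1$ and the attribute vector of $a_1$ (here $\times$ denotes the operation $\circ$ stipulated above) are as follows: $1 \times u_1 = u_1 = (1, 0, 1, 0, 1)$, $0 \times u_1 = (1, 1, 1, 1, 1)$, $1 \times a_1 = a_1$, $0 \times a_1 = (1, \dots, 1)^{T}_{10 \times 1}$.
Scalar-multiplication vector operations of the other object vectors and attribute vectors can be obtained similarly.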
Based on the scalar-multiplication vector operation rules of the object vectors and the attribute vectors, in this specific example the vector base determined by the attribute $a_1$ on the attribute set can be calculated as follows: $B(a_1) = a_1 \otimes (u_1, \dots, u_{10}) = (1 \times u_1) \wedge (0 \times u_2) \wedge (0 \times u_3) \wedge (1 \times u_4) \wedge (0 \times u_5) \wedge (1 \times u_6) \wedge (1 \times u_7) \wedge (1 \times u_8) \wedge (1 \times u_9) \wedge (0 \times u_{10}) = u_1 \wedge (1,1,1,1,1) \wedge (1,1,1,1,1) \wedge u_4 \wedge (1,1,1,1,1) \wedge u_6 \wedge u_7 \wedge u_8 \wedge u_9 \wedge (1,1,1,1,1) = (1, 0, 0, 0, 1)$, and the support degree is as follows:
$S(B(a_1)) = S(a_1) = (1+0+0+1+0+1+1+1+1+0)/10 = 0.6$.
The vector bases determined by the other attributes can be obtained similarly, and are respectively as follows: $B(a_2) = (0, 1, 0, 0, 1)$, $B(a_3) = (0, 0, 1, 0, 0)$, $B(a_4) = (0, 0, 1, 1, 0)$, $B(a_5) = (0, 0, 0, 0, 1)$.
In one embodiment, specifically, the vector bases are sorted from small to large, and the vectors on the attribute set are generated by combining the vector bases two at a time in ascending order. To this end, the vector base $(p'_{i1}, p'_{i2}, p'_{i3}, p'_{i4}, p'_{i5})$ determined by the attribute $a_i$ corresponds to the natural number $p'_{i1} \times 2^4 + p'_{i2} \times 2^3 + p'_{i3} \times 2^2 + p'_{i4} \times 2 + p'_{i5}$, and $B(a_1)$, $B(a_2)$, $B(a_3)$, $B(a_4)$ and $B(a_5)$ are sorted in ascending order of their corresponding natural numbers. The smallest vector base is combined with each of the other vector bases to obtain new vectors, and the new vectors are inserted into the sorted vector bases according to their corresponding natural numbers; the step is then executed again on the next smaller vectors until no new vector is generated. This process ensures that only two vectors participate in each combination operation, i.e., if $T(J') = (p'_{i1}, p'_{i2}, p'_{i3}, p'_{i4}, p'_{i5})$ and $T(J'') = (p''_{i1}, p''_{i2}, p''_{i3}, p''_{i4}, p''_{i5})$ are generated vectors, then the vector generated by $T(J')$ and $T(J'')$ is as follows: $T(J') \vee T(J'') = (p'_{i1}, p'_{i2}, p'_{i3}, p'_{i4}, p'_{i5}) \vee (p''_{i1}, p''_{i2}, p''_{i3}, p''_{i4}, p''_{i5}) = (p'_{i1} \vee p''_{i1}, p'_{i2} \vee p''_{i2}, p'_{i3} \vee p''_{i3}, p'_{i4} \vee p''_{i4}, p'_{i5} \vee p''_{i5})$.
The natural number corresponding to $B(a_1)$ is $1 \times 2^4 + 0 \times 2^3 + 0 \times 2^2 + 0 \times 2 + 1 = 17$, the natural number corresponding to $B(a_2)$ is 9, the natural number corresponding to $B(a_3)$ is 4, the natural number corresponding to $B(a_4)$ is 6, and the natural number corresponding to $B(a_5)$ is 1. Table 2 shows the 5 bases sorted in ascending order of their corresponding natural numbers, together with their support degrees.
Table 2
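The encoding and the resulting order can be checked with a few lines of Python. The sketch uses the five vector bases stated above; `to_natural` is an illustrative name, and the support degrees of Table 2 are not reproduced because the table itself is not available in this text.

```python
from typing import Dict, Sequence, Tuple


def to_natural(vector: Sequence[int]) -> int:
    """Read the 0/1 vector as a binary number, most significant bit first."""
    value = 0
    for bit in vector:
        value = value * 2 + bit
    return value


# Vector bases as stated in Embodiment 1.
BASES: Dict[str, Tuple[int, ...]] = {
    "a1": (1, 0, 0, 0, 1),
    "a2": (0, 1, 0, 0, 1),
    "a3": (0, 0, 1, 0, 0),
    "a4": (0, 0, 1, 1, 0),
    "a5": (0, 0, 0, 0, 1),
}

ranked = sorted(BASES, key=lambda name: to_natural(BASES[name]))
print([(name, to_natural(BASES[name])) for name in ranked])
# [('a5', 1), ('a3', 4), ('a4', 6), ('a2', 9), ('a1', 17)]
```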
As shown in Table 2, the smallest vector base is combined with each of the other vector bases to obtain new vectors, and the new vectors are inserted into the sorted vector bases according to their corresponding natural numbers. Table 3 shows the new vectors obtained by combining $B(a_5)$ with each of the other vector bases.
Table 3
Table 4 gives all the vectors, successively generated through the above process, on the attribute set.
Table 4
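One possible reading of the sorted, pairwise generation procedure is sketched below. Each vector is held as its natural number, so the componentwise OR becomes a bitwise OR, and new vectors are inserted into the sorted list until nothing new appears. The function names are illustrative, and since Table 4 is not reproduced in this text the output cannot be compared with it here.

```python
import bisect
from typing import List, Sequence


def to_natural(vector: Sequence[int]) -> int:
    """Read the 0/1 vector as a binary number, most significant bit first."""
    value = 0
    for bit in vector:
        value = value * 2 + bit
    return value


def close_under_join(bases: Sequence[Sequence[int]]) -> List[int]:
    """Combine vectors two at a time, in ascending order, until no new one appears."""
    ordered = sorted({to_natural(b) for b in bases})
    seen = set(ordered)
    i = 0
    while i < len(ordered):
        current = ordered[i]
        for other in list(ordered):  # snapshot; later insertions get their own turn
            combined = current | other  # bitwise OR equals the componentwise join
            if combined not in seen:
                seen.add(combined)
                # A new value is always larger than `current`, so it lands after i.
                bisect.insort(ordered, combined)
        i += 1
    return ordered


# Vector bases B(a1)..B(a5) as stated in Embodiment 1.
BASES = [
    (1, 0, 0, 0, 1),
    (0, 1, 0, 0, 1),
    (0, 0, 1, 0, 0),
    (0, 0, 1, 1, 0),
    (0, 0, 0, 0, 1),
]

generated = close_under_join(BASES)
print(generated)
# [1, 4, 5, 6, 7, 9, 13, 15, 17, 21, 23, 25, 29, 31]
print([format(v, "05b") for v in generated])
```

Only two vectors take part in each combination, and no subset of the attribute set is enumerated explicitly.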
According to Table 4, the support degree threshold and the confidence threshold are both set to 0.5; whether the generated vectors meet the thresholds is judged two at a time, in ascending order, and the attribute association rules are generated. For example, starting from the smallest vector $B(a_5)$, the vectors $B(a_5)$ and $B(a_5) \vee B(a_3)$, the latter generated with $B(a_3)$, are the first whose support degrees are greater than or equal to 0.5. Therefore, $B(a_5)$ and $B(a_3)$ can generate the following attribute association rules: $(0,0,0,0,1) \to (0,0,1,0,0)$ and $(0,0,1,0,0) \to (0,0,0,0,1)$, i.e., $a_5 \to a_3$ and $a_3 \to a_5$, whose confidences are 5/7 and 5/8 respectively, both greater than or equal to 0.5.
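The two confidences can be traced back to the confidence formula. The worked equation below assumes that, among the 10 objects, 5 have both $a_3$ and $a_5$, 7 have $a_5$, and 8 have $a_3$; these counts are inferred from the stated values 5/7 and 5/8 with $n = 10$ and are not read from Table 1, which is not reproduced in this text.

$$
C(a_5 \to a_3) = \frac{S(T(\{a_3, a_5\}))}{S(T(\{a_5\}))} = \frac{5/10}{7/10} = \frac{5}{7} \approx 0.71,
\qquad
C(a_3 \to a_5) = \frac{S(T(\{a_3, a_5\}))}{S(T(\{a_3\}))} = \frac{5/10}{8/10} = \frac{5}{8} = 0.625
$$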
Other attribute association rules meeting the support degree and confidence threshold conditions can be generated similarly.
Table 5 gives attribute association rules successively generated by every two vectors and meeting the conditions.
Table 5
The association rule mining method based on vector operations of the present invention generates the vector bases on the attribute set by utilizing the vector representations of the objects and attributes and the stipulated vector operations. The vector bases characterize the most basic correlations of the attributes and are used to generate the vector topology on the attribute set, which avoids power set operations on the attribute set and reduces the number of operations between the objects and attributes. The frequent closed item sets meeting the conditions are sought in the vector topology on the attribute set; meanwhile, all generating elements, including the minimal generating elements, are included in the vector topology, which narrows the search range of the frequent closed item sets and their minimal generating elements.
Embodiment 2:
The embodiment uses the EXTENDED BAKERY Dataset, which contains 75000 sales records of 40 kinds of bread (No. 1 to 40) and 10 kinds of beverages (No. 41 to 50); the mined attribute association rules reflect the association between the bread and the beverages purchased. The attribute association rules are mined by adopting the method, with the support degree threshold set to 0.01 and the confidence threshold set to 0, and 352 attribute association rules are generated. They are compared with the attribute association rules of the classic Apriori algorithm in terms of quantity, run time and memory occupation: the quantity of the attribute association rules and the antecedents and consequents of the rules are fully consistent in content, and the run time and memory occupation are shown in Table 6.
Table 6
In a comparison experiment, the embodiment copies and multiplies the original 75000 records 7 times, with the scale increased in multiples of 2, so that 8 groups of data are obtained. The quantity, support degree and confidence of the obtained rules are invariant, but the run time and the memory occupation change. Because the data are multiplied, the problem of repeated data calculation becomes prominent, and it can be seen clearly that the prior-art algorithm has shortcomings in handling the repeated generation of attribute association rules. Figure 2 shows the run time curves of the algorithm provided in the present invention and of the Apriori algorithm; the run time of the method of the present invention for processing the same data is remarkably shorter than that of the existing Apriori algorithm. As shown in Table 6, the memory occupation of the method of the present invention also compares favorably with that of the existing Apriori algorithm.
The detailed description of specific embodiments of the present invention is given above in combination with the attached drawings, but it does not limit the present invention to the above specific embodiments, and those skilled in the art may make various modifications or changes without departing from the spirit and scope claimed by the present invention.

Claims (7)

1. Een associatieregelminingmethode gebaseerd op vectorbewerkingen, gekenmerkt door de volgende stappen te omvatten: het definiëren van vectorrepresentaties van voorwerpen en attributen, en het vastleggen van bewerkingsregels van objectvectoren en attribuutvectoren, voor het berekenen van vectorbases op een attribuutstel; het uitvoeren van berekening op basis van de vectorbases voor het genereren van vectoren op het attribuutstel; het berekenen van mate van steun van elke vector op het attribuutstel op basis van de vectoren van het attribuutstel; het instellen van een steunmatedrempel van de vectorbases, en het uitfilteren van vectoren voorbij de steunmatedrempelvoorwaarde; het, op basis van een vooraf ingestelde vertrouwensdrempel, minen van attribuutassociatieregels die voldoen aan de voorwaarde in de vectoren voorbij de steunmatedrempelvoorwaarde.An association rule mining method based on vector operations, characterized by comprising the steps of: defining vector representations of objects and attributes, and recording operation rules of object vectors and attribute vectors, for calculating vector bases on an attribute set; performing calculation based on the vector bases for generating vectors on the attribute set; calculating degree of support of each vector on the attribute set based on the vectors of the attribute set; setting a support rate threshold of the vector bases, and filtering out vectors beyond the support rate threshold condition; mining, based on a preset trust threshold, attribute association rules that satisfy the condition in the vectors beyond the support measure threshold condition. 2. 
De associatieregelminingmethode op basis van vectorbewerkingen volgens conclusie 1, met het kenmerk dat: de stap van het definiëren van vectorrepresentaties van voorwerpen en attributen en het bepalen van bewerkingsregels van objectvectoren en attribuutvectoren omvat: het definiëren van een informatiesysteem I, gerepresenteerd als: l=(U, A, f), waarbij U een voorwerpstel representeert, A een attribuutstel representeert, waarbij U={ui,...,Un}, A^ai,...,am}, u„ het n-de element in het objectstel representeert, en am het m-de element in het attribuutstel representeert; f wordt aangeduid als een informatiefunctie van I, dat wil zeggen f:U*A—►{Ο,Ι}, voor elke (Uj.aj)eUxA, if f(Uj, aj)=py^), dan wordt aangeduid dat het i-de object Uj niet het j-de attribuut aj heeft, indien f(Uj, aj)^>jj=1, dan wordt aangeduid dat het i-de object Uj het j-de attribuut at heeft; het definiëren van Ai-»A2 als een attribuutassociatieregel, waarbij Ai, A2^ en AiI1A2=0, waarbij Ai wordt aangeduid als een antecedent en A2 wordt aangeduid als een consequent; het definiëren van Uj=(pii,...,pjm)ixm, hetgeen representeert dat het object u, gerepresenteerd kan worden als een m-dimensionale rijvector gevormd door 0 of 1; het definiëren van aj = (pij.....Pnj^ , hetgeen representeert dat attribuut at gerepresenteerd kan worden als een n-dimensionale kolomvector gevormd door 0 of 1; het vastleggen van vectorbewerkingsregels als volgt: 1 o Uj=Uj, 0 o Ui=1ixm=(1,...,1)i*m. 1 o 8]=% 0 o a,=1 nxi=(l.-,l)Li, waarbij (1,...,1)ixm een m-dimensionale rijvector representeert die alle elementen van 1 heeft, en ¢1.-*1^1 een n-dimensionale kolomvector representeert die alle elementen 1 heeft; het vastleggen van vectorbewerkingsregels tussen het attribuut aj en (ui.....un) als volgt: aj ® (i/p ···, un) = (ρυ o Ul) λ ··· λ (pnj o u0)t het vastleggen van vectorbewerkingsregels tussen het attribuut Uj en (ai,...,am) als volgt: u, ® (¾. 
···, am) = (pn o a,) λ ·· λ o am), waarbij n, m, i en j alle positieve gehele getallen zijn.The association rule mining method based on vector operations according to claim 1, characterized in that: the step of defining vector representations of objects and attributes and determining operation rules of object vectors and attribute vectors comprises: defining an information system I, represented as: 1 = (U, A, f), where U represents an object set, A represents an attribute set, where U = {ui, ..., Un}, A ^ ai, ..., am}, u "the nth element in the object set, and am represents the m-th element in the attribute set; f is indicated as an information function of I, i.e. f: U * A — ► {Ο, Ι}, for each (Uj.aj) eUxA, if f (Uj, aj) = py ^), then it is indicated that the i-th object Uj does not have the j-th attribute aj, if f (Uj, aj) ^> jj = 1, then it is indicated that the i-th object Uj has the j-th attribute at; defining A1 - A2 as an attribute association rule, where A1, A2 -, and A111 - A2 = 0, where A1 is designated as an antecedent and A2 is designated as a consistent; defining Uj = (pii, ..., pjm) ixm, which represents that the object u, can be represented as an m-dimensional row vector formed by 0 or 1; defining aj = (pij ... Pnj ^, which represents that attribute at can be represented as an n-dimensional column vector formed by 0 or 1; defining vector processing lines as follows: 1 o Uj = Uj, 0 o Ui = 1ixm = (1, ..., 1) i * m. 1 o 8] =% 0 among others, = 1 nxi = (1, 1, 1) Li, where (1, ..., 1) ixm represents an m-dimensional row vector that has all the elements of 1, and ¢ 1 .- * 1 ^ 1 represents an n-dimensional column vector that has all the elements 1; capturing vector processing lines between the attribute aj and (from ..... un) as follows: aj ® (i / p ···, un) = (ρυ o Ul) λ ··· λ (pnj o u0) t the establishment of vector processing lines between the attribute Uj and (ai, ..., am) as follows: u, ® (¾. 
···, a_m) = (p_{i1} ∘ a_1) ∧ ··· ∧ (p_{im} ∘ a_m), where n, m, i and j are all positive integers.

3. The association rule mining method based on vector operations according to claim 1, characterized in that said calculating of vector bases on an attribute set is as follows: B(a_j) is defined as the vector basis generated by the attribute a_j, $B(a_j) = a_j \otimes (u_1, \ldots, u_n) = (p_{1j} \circ u_1) \wedge \cdots \wedge (p_{nj} \circ u_n)$, and the resulting vector basis on the attribute set is $B(A) = \{B(a_j) \mid a_j \in A\}$, where n and j are both positive integers.

4. The association rule mining method based on vector operations according to claim 3, characterized in that said performing of calculation on the vector bases to generate vectors on the attribute set is as follows: the vector T(J') generated by the vector bases corresponding to J' is represented as $T(J') = \bigvee_{j \in J'} B(a_j)$, where J' is an index set, and all vectors generated by the vector bases are denoted $T(A) = \{T(J') \mid J' \subseteq \{1, 2, \ldots, m\}\}$, where m and j are both positive integers.

5. The association rule mining method based on vector operations according to claim 4, characterized in that said calculating of the degree of support of each vector on the attribute set, based on the vectors on the attribute set, is as follows: the degree of support of each vector $T(J') \in T(A)$ is $S(T(J')) = (p'_{1j} + p'_{2j} + \cdots + p'_{nj})/n$, where n and j are both positive integers.

6. The association rule mining method based on vector operations according to any one of claims 1 to 5, characterized in that the step of mining, based on a preset confidence threshold, the attribute association rules that satisfy the condition among the vectors exceeding the support threshold comprises: mining, based on a preset confidence threshold for association rules, the attribute association rules in T(A) whose confidence is greater than the confidence threshold.

7. The association rule mining method based on vector operations according to claim 6, characterized in that said mining of an attribute association rule whose confidence is greater than the confidence threshold is as follows: two vectors in T(A) are selected, denoted $T(A_1)$ and $T(A_2)$, where $T(A_1)$ represents the vector on the attribute set determined by the vector bases corresponding to all elements of an attribute subset $A_1$, and $T(A_2)$ represents the vector on the attribute set determined by the vector bases corresponding to all elements of an attribute subset $A_2$. Either of $T(A_1)$ and $T(A_2)$ is taken as the antecedent, and the other vector with the antecedent subtracted from it is taken as the consequent, so that an attribute association rule is generated, that is, $T(A_1) \rightarrow (T(A_2) - T(A_1))$ or $T(A_2) \rightarrow (T(A_1) - T(A_2))$; the confidence of the generated attribute association rules is then $C(T(A_1) \rightarrow (T(A_2) - T(A_1))) = S(T(A_1 \cup A_2))/S(T(A_1))$ or $C(T(A_2) \rightarrow (T(A_1) - T(A_2))) = S(T(A_1 \cup A_2))/S(T(A_2))$.
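The computation described in claims 3 to 7 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes a binary relation matrix `P` (rows are objects, columns are attributes) and reads the combination of vector bases in claim 4 as a componentwise logical AND of the selected columns, which is the usual itemset semantics and keeps the confidence in claim 7 at most 1; the claims do not spell out that reading. The toy data and all function names are hypothetical.

```python
# Hypothetical toy relation matrix: 4 objects (rows) x 3 attributes (columns).
P = [
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 1],
    [0, 1, 1],
]


def basis(P, j):
    """Vector basis B(a_j): column j of the relation matrix."""
    return [row[j] for row in P]


def vector(P, J):
    """Vector T(J'): combine the bases indexed by J componentwise.

    Assumption: the combination is a logical AND of the binary columns.
    """
    return [int(all(row[j] for j in J)) for row in P]


def support(vec):
    """Degree of support: sum of the components divided by n (claim 5)."""
    return sum(vec) / len(vec)


def confidence(P, A1, A2):
    """Confidence S(T(A1 u A2)) / S(T(A1)) of the rule with antecedent A1 (claim 7)."""
    s1 = support(vector(P, A1))
    if s1 == 0:
        return 0.0  # avoid division by zero for an antecedent that never occurs
    return support(vector(P, set(A1) | set(A2))) / s1


def rules_from_pair(P, A1, A2, min_confidence):
    """Try both attribute subsets as antecedent; keep rules above the threshold."""
    rules = []
    for antecedent, other in ((A1, A2), (A2, A1)):
        consequent = set(other) - set(antecedent)
        if not consequent:
            continue  # nothing left for the consequent after subtraction
        c = confidence(P, antecedent, other)
        if c > min_confidence:
            rules.append((sorted(antecedent), sorted(consequent), c))
    return rules


if __name__ == "__main__":
    print(support(vector(P, [0, 1])))
    print(rules_from_pair(P, [0], [1], 0.6))
```

In this sketch only pairs of attribute subsets whose vectors already pass the support threshold would be passed to `rules_from_pair`; that filtering step is left to the caller.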
NL1042116A 2015-10-30 2016-10-27 Association rule mining method based on vector operations NL1042116B1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510729332.8A CN105335785B (en) 2015-10-30 2015-10-30 A kind of association rule mining method based on vector operation

Publications (2)

Publication Number Publication Date
NL1042116A true NL1042116A (en) 2017-05-19
NL1042116B1 NL1042116B1 (en) 2017-09-07

Family

ID=55286300

Family Applications (1)

Application Number Title Priority Date Filing Date
NL1042116A NL1042116B1 (en) 2015-10-30 2016-10-27 Association rule mining method based on vector operations

Country Status (4)

Country Link
CN (1) CN105335785B (en)
GB (1) GB2558438A (en)
NL (1) NL1042116B1 (en)
WO (1) WO2017071005A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109120634A (en) * 2018-09-05 2019-01-01 广州视源电子科技股份有限公司 A kind of method, apparatus, computer equipment and the storage medium of port scan detection

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106021546A (en) * 2016-05-27 2016-10-12 西华大学 Minimum non-reduction association rule mining method based on item subset example tree
CN107766323B (en) * 2017-09-06 2021-08-31 淮阴工学院 Text feature extraction method based on mutual information and association rule
CN108182294B (en) * 2018-01-31 2021-04-16 湖北工业大学 Movie recommendation method and system based on frequent item set growth algorithm
CN110417594B (en) * 2019-07-29 2020-10-27 吉林大学 Network construction method and device, storage medium and electronic equipment
CN112597236B (en) * 2020-12-04 2022-10-25 河南大学 Concept lattice-based association rule optimization method and visual display method
CN113822702B (en) * 2021-08-30 2023-10-20 国网辽宁省电力有限公司阜新供电公司 Inter-industry electricity consumption demand correlation analysis system and method under emergency

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7370033B1 (en) * 2002-05-17 2008-05-06 Oracle International Corporation Method for extracting association rules from transactions in a database

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10222493A (en) * 1997-02-06 1998-08-21 Kokusai Denshin Denwa Co Ltd <Kdd> Mutual causality analysis system
CN101477375B (en) * 2009-01-05 2012-01-04 东南大学 Sensor data verification method based on matrix singular values association rules mining
CN101510204B (en) * 2009-03-02 2010-09-29 南京航空航天大学 Abnormal enquiry and monitor method based on target condition association rule database
CN101655857B (en) * 2009-09-18 2013-05-08 西安建筑科技大学 Method for mining data in construction regulation field based on associative regulation mining technology
CN102968375B (en) * 2012-11-30 2015-10-28 中国矿业大学 Based on the infeasible paths detection method of association rule mining
CN103678530A (en) * 2013-11-30 2014-03-26 武汉传神信息技术有限公司 Rapid detection method of frequent item sets

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU ZHI ET AL: "A Vector Operation Based Fast Association Rules Mining Algorithm", BIOINFORMATICS, SYSTEMS BIOLOGY AND INTELLIGENT COMPUTING, 2009. IJCBS '09. INTERNATIONAL JOINT CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 3 August 2009 (2009-08-03), pages 561 - 564, XP031530838, ISBN: 978-0-7695-3739-9 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109120634A (en) * 2018-09-05 2019-01-01 广州视源电子科技股份有限公司 A kind of method, apparatus, computer equipment and the storage medium of port scan detection
CN109120634B (en) * 2018-09-05 2021-02-05 广州视源电子科技股份有限公司 Port scanning detection method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
GB2558438A (en) 2018-07-11
CN105335785A (en) 2016-02-17
WO2017071005A1 (en) 2017-05-04
CN105335785B (en) 2017-12-19
NL1042116B1 (en) 2017-09-07
GB201803769D0 (en) 2018-04-25

Legal Events

Date Code Title Description
MM Lapsed because of non-payment of the annual fee

Effective date: 20191101