CN111026270A - User behavior pattern mining method under mobile context awareness environment - Google Patents


Info

Publication number
CN111026270A
Authority
CN
China
Prior art keywords
seq
sequence
context
user
rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911249315.9A
Other languages
Chinese (zh)
Inventor
刘彩虹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University Of Foreign Languages
Original Assignee
Dalian University Of Foreign Languages
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University Of Foreign Languages
Priority to CN201911249315.9A
Publication of CN111026270A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465: Query processing support for facilitating data mining operations in structured databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2477: Temporal data queries
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/04: Inference or reasoning models

Abstract

The invention discloses a user behavior pattern mining method in a mobile context-aware environment, which comprises the following steps: (1) processing the context-aware data and user interaction behavior information collected by the smartphone through a context modeling module and a context inference module in sequence, extracting user behavior patterns from the multidimensional attribute set formed by the context-aware data and the user interaction behavior information, generating sequence rules, and outputting the user behavior patterns in the form of sequence rules stored in a sequence rule base; (2) when the user's context-aware data matches the antecedent of a sequence rule in the sequence rule base, predicting the user's next interaction behavior. The data description model and the algorithm adopted by the invention can learn multidimensional sequence rules from mobile context-aware data and use them for service recommendation or behavior prediction.

Description

User behavior pattern mining method under mobile context awareness environment
Technical Field
The invention relates to the field of data mining, in particular to a user behavior pattern mining method under a mobile context awareness environment.
Background
The rapid development of mobile internet and sensor technologies enables various sensors and various application programs built in the smart phone to sense the situation of a user in real time and record the interaction behavior of the user and the smart phone. The relationship between the user context and the interaction behavior is called as a user behavior mode in a mobile context-aware environment, and is an important basis for providing personalized recommendation service for the user.
Rich sensor data is an important data source for user behavior pattern mining in a mobile context-aware environment, but also results in accurate recommendations becoming increasingly complex and challenging. A mobile context aware system generally comprises four modules as shown in fig. 1:
⑴ a context acquisition module, which collects the data sources for user behavior pattern mining in a mobile context-aware environment, including mobile context-aware data and user interaction behavior information;
⑵ a context modeling module, which represents the mobile context-aware data and the user interaction behavior information in a unified, appropriate, computer-recognizable format;
⑶ a context inference module, which infers high-level context information from low-level context information and mines the relationship between context and interaction behavior, i.e., the user behavior pattern;
⑷ a context application module, which matches the user's current context against the user behavior patterns generated by context inference and triggers proactive services on the smartphone.
Context modeling and context inference are the key steps of user behavior pattern mining in a mobile context-aware environment and are the main research content of the invention. Schilit et al. implemented the first context-aware application system using a key-value model; the MOCA system [1] adopts an XML-based object-oriented model; the W4Diary system [2] structures data with a W4 (Who, What, Where, When) context model and extracts high-level information from location data; and the CoBrA system [3] and the CooL system [4] use ontology-based mathematical models to improve the interactivity of the system. There is no unified standard for context representation, which generally has to be adapted to the actual problem in each application. The present method adopts a non-relational data structure based on the key-value model, which is easy to manage and often used in distributed service frameworks but is structurally too simple; its structure therefore needs to be extended for use in a mobile context-aware environment, to support effective fusion, rapid growth, and highly concurrent access of multi-source heterogeneous sensor data [5].
Context inference methods are diverse, including decision trees [6], naive Bayes [7], support vector machines [8,9], ontology-based reasoning [10,11], rule-based reasoning [12,13], and fuzzy reasoning [14]. The results of Lim and Dey [15] show that rule-based reasoning is the most popular because it is intuitive and easy to understand, and that it is suitable for generating high-level context information from low-level context information. Because traditional rule mining algorithms [16] cannot be used directly to mine user behavior sequence patterns, Mannila et al. [17] proposed the MINEPI algorithm to find partially ordered item sets that occur multiple times within the same time window of a sequence, and Tang et al. [18] applied this algorithm together with temporal constraints to mine sequence rules in sequences described by multiple attributes. However, their work can only discover rules that occur repeatedly within a single sequence. The work in [19] and Harms et al. [20] studied the problem of mining rules across multiple sequences, but the algorithms test all possible rules without any strategy for pruning the search space, which is inefficient. Fournier-Viger et al. [21] improved the efficiency of mining rules in sequences through strategies such as restricting the time window and controlling the direction in which rules grow. Hong et al. [22] designed a variable time-domain pattern mining algorithm that can mine not only frequent patterns over the whole database but also the most recent frequent patterns from a past point up to the present; however, the algorithm considers only time information and ignores other contexts such as location and weather.
Across the research above, the literature focuses on the comprehensibility and the efficiency of inference methods, but three issues remain. First, the asymmetry of the data is seldom considered: context data is sensed and collected continuously by the smartphone while interaction behavior occurs only occasionally, so the two are severely imbalanced. Second, previous research focuses only on the sequence itself, does not consider attributes such as location, time, and weather, and does not explain how the multidimensional context interacts to influence user behavior. Finally, most methods mine rules only once and produce static user behavior rules; because rule updating is not considered, they cannot reflect how user behavior changes over time.
Disclosure of Invention
The invention aims to provide a user behavior pattern mining method in a mobile context-aware environment, which is used for designing a multi-dimensional sequence reasoning algorithm on the basis of constructing a multi-source heterogeneous sensor data description model and solving the problems of user behavior pattern mining and pattern updating in the mobile context-aware environment.
The method adopts a nested key value model to effectively fuse and store multi-source heterogeneous mobile context awareness information, constructs a rule-based multi-dimensional sequence pattern mining algorithm UTDMSP, can find globally frequent and locally frequent user behavior patterns from the mobile context awareness information and interactive behaviors of the user, identifies the behavior habits and interest preferences of the user which are kept for a long time and new changes of recent habits and preferences, utilizes the reproduced situation to push personalized customized services for the user, brings good experience effect to the user and improves the satisfaction degree of the user.
Specifically, the invention provides the following technical scheme:
a user behavior pattern mining method under a mobile context-aware environment comprises the following steps:
(1) processing context perception data and user interaction behavior information collected by the smart phone through a context modeling module and a context reasoning module in sequence, extracting a user behavior pattern from a multi-dimensional attribute set consisting of the context perception data and the user interaction behavior information, generating a sequence rule, outputting the user behavior pattern in the form of the sequence rule and storing the user behavior pattern in a sequence rule base;
(2) when the user's context-aware data matches the antecedent of a sequence rule in the sequence rule base, predicting the user's next interaction behavior.
As a still further preferred aspect of the present invention, the context modeling module is configured to represent the mobile context awareness data and the user interaction behavior information in a unified, computer-recognizable format.
As a further preferable scheme of the present invention, the context inference module is configured to infer high-level context information from low-level context information, and mine a relationship between context awareness data and user interaction behavior information, that is, mine a user behavior pattern.
As a still further preferred aspect of the present invention, the method for extracting the user behavior pattern and generating the sequence rule is to use a snapshot to describe a mapping from a time domain to a multidimensional attribute space;
the snapshot is defined as: given a mapping function $E: T \rightarrow D$ from a discrete time domain $T=\{t_1,t_2,\ldots,t_n\}$ to an $(m+1)$-dimensional attribute space $D=\{D_1,D_2,\ldots,D_{m+1}\}$, the image $E(t_i)=\{d_1,d_2,\ldots,d_{m+1}\}$ is called the snapshot at time $t_i$, $i=1,2,\ldots,n$, where $d_k\in dom(D_k)$ represents the state of the $k$-th attribute, $k=1,2,\ldots,m+1$;
wherein $\{D_1,D_2,\ldots,D_m\}$ are different attributes of the mobile context-aware data, $D_{m+1}$ is the interaction behavior information of the user and the smartphone, and $dom(D_k)$ is the value range of the $k$-th attribute $D_k$;
a subset of the snapshot is defined as a multidimensional item $e$, $e=\{(d_k)\mid d_k\in dom(D_k),\ 1\le k\le m+1\}$;
a multidimensional sequence $Seq_T$ is an ordered list of non-empty tuples $(e,t)$ defined over the discrete time domain $T$, denoted $Seq_T=\{(e,t)\mid t\in T\}$; a multidimensional sequence database is a set of multidimensional sequences, denoted $SeqDB=\{Seq_1,Seq_2,\ldots,Seq_N\}$, where $Seq_j$ is the $j$-th sequence in SeqDB, $j\in\{1,2,\ldots,N\}$;
in a mobile environment, user interaction behavior occurs intermittently, so a certain time interval is allowed between items, and two thresholds, gap and width, are introduced: gap is the maximum difference between the occurrence times of any two adjacent items, and width is the maximum difference between the occurrence times of the first and last items; multiple items are allowed to occur at any time point, and the same item may occur repeatedly at multiple time points;
thereby defining: $Seq_T=\{(e_1,t_1),(e_2,t_2),\ldots,(e_l,t_l)\}$ is a sequence on the time domain $T=\{t_1,t_2,\ldots,t_l\}$, and $Seq_{T'}=\{(e_1',t_1'),(e_2',t_2'),\ldots,(e_l',t_l')\}$ is a sequence on the time domain $T'=\{t_1',t_2',\ldots,t_l'\}$; $Seq_T$ is a subsequence of $Seq_{T'}$ if and only if: 1) (formula images BDA0002308569880000041 and BDA0002308569880000042) $e_1=e_{j_1}, e_2=e_{j_2},\ldots,e_l=e_{j_l}$; 2) $t_1'\le t_1\le\cdots\le t_l\le t_l'$;
defining: if $Seq_T$ is a subsequence of $Seq_{T'}$ and the times of the multidimensional items of $Seq_T$ satisfy 1) $t_{i+1}-t_i\le gap$ (formula images BDA0002308569880000043 and BDA0002308569880000044), $i=1,\ldots,l-1$; and 2) $t_l-t_1\le width$, then the subsequence is called a valid occurrence of $Seq_T$ in the sequence $Seq_{T'}$, denoted $Occr(Seq_T,Seq_{T'},gap,width)$;
defining: a subsequence Seq is frequent in the sequence database $SeqDB=\{Seq_1,Seq_2,\ldots,Seq_N\}$ if one of the following conditions is satisfied:
$|Occr(Seq,SeqDB,gap,width)|/N\ge MinSup$, ⑵
$|Occr(Seq,SeqDB,gap,width)|/(N-start(Seq)+1)\ge MinSup$, ⑶
where MinSup is the minimum sequence support threshold set by the user;
if formula ⑵ is satisfied, Seq is a globally frequent subsequence in the sequence database SeqDB, with sequence number range 1 to N; otherwise formula ⑶ is checked, and if it is satisfied, Seq is a locally frequent subsequence in the sequence number range start(Seq) to N, where start(Seq) is the first sequence number in the range start(Seq) to N that contains the frequent subsequence; in the recursive search for a range start(Seq) to N that satisfies formula ⑶, each step increases start(Seq) by 1 and decreases $|Occr(Seq,SeqDB,gap,width)|$ by 1, until formula ⑶ is satisfied or $|Occr(Seq,SeqDB,gap,width)|=0$;
defining: for a frequent subsequence $Seq=\{(e_1,T_1),(e_2,T_2),\ldots,(e_{l-1},T_{l-1}),(e_l,T_l)\}$, $(e_1,T_1),(e_2,T_2),\ldots,(e_{l-1},T_{l-1})$ is a sequence describing the context, composed of multidimensional items, and is also the prefix of the subsequence, denoted $prefix(Seq)$, while $(e_l,T_l)$ represents the interaction behavior of the user; the confidence of the subsequence Seq in the sequence database is $SeqConf=|Occr(Seq,SeqDB,gap,width)|/|Occr(prefix(Seq),SeqDB,gap,width)|$, ⑷
if SeqConf is not lower than the minimum confidence MinConf, a sequence rule $r:\{e_1,e_2,\ldots,e_{l-1}\}\rightarrow e_l$ is generated; here, rule antecedents are limited to context data of the same dimension.
As a further preferred scheme of the present invention, the context inference module adopts a multidimensional sequence pattern mining algorithm UTDMSP, and the specific process is as follows:
step 1: scan the sequence database once and map it into a nested key-value model for storage, wherein a key of the parent layer is a sequence described by multidimensional sensing data and the value corresponding to that key is a set of key-value pairs, in which a key of the child layer is the sequence number of a sequence and the value of the child layer is the time information at which the pattern occurs in the sequence with that sequence number;
step 2: if the part obtained by removing the first item set of sequence pattern $Seq_1$ is the same as the part obtained by removing the last item set of sequence pattern $Seq_2$, then $Seq_1$ and $Seq_2$ can be joined, i.e., the last item set of $Seq_2$ is appended to the tail of $Seq_1$ to obtain a new candidate sequence, whose sequence number list is the intersection of the sequence number lists of $Seq_1$ and $Seq_2$;
step 3: if a candidate sequence contains an infrequent subsequence, the candidate sequence cannot be frequent and is deleted from the candidate set. The support of each remaining candidate subsequence Seq under the time constraints is calculated in the sequence database $SeqDB=\{Seq_1,Seq_2,\ldots,Seq_N\}$; if the global support of the candidate in the sequence database is greater than or equal to the minimum support threshold, the candidate is added to the frequent set; otherwise, it is verified whether the support of the candidate over a recent period satisfies local frequency, and if so, the candidate is added to the frequent set;
step 4: repeat steps 2 and 3 until no more candidates are generated;
step 5: calculate the confidence of each frequent subsequence, and if the minimum confidence threshold is met and the last item of the subsequence is a user behavior, generate a rule and put it into the rule set R.
The user behavior sequence rules mined by the multidimensional sequence pattern mining algorithm UTDMSP are stored in the sequence rule base, where they wait for context matching by the application module to trigger behavior prediction.
Compared with the prior art, the invention has the following beneficial effects: the data description model and algorithm adopted by the invention can learn multidimensional sequence rules from mobile context-aware data and use them for service recommendation or behavior prediction. This supports personalized recommendation services for precision marketing centered on the user's context information, which is consistent with the user-centered principle that precision marketing emphasizes, provides enterprises with more diverse data for marketing, and increases the effectiveness of personalized precision marketing.
Drawings
Fig. 1 is a schematic block diagram of the prior art.
FIG. 2 is a schematic block diagram of the present technology.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, in the embodiment of the present invention, a user behavior pattern mining method in a mobile context-aware environment is provided, where first, a data model based on nested key values is constructed for the characteristics of multisource heterogeneity of mobile context-aware data, and describes mobile context information and user interaction behaviors, so as to provide knowledge support for context modeling and context inference; then, a rule-based multi-dimensional sequence pattern mining algorithm UTDMSP is constructed, and globally frequent and locally frequent user behavior patterns are mined from a sequence database; and finally, learning a multi-dimensional sequence rule from the mobile context awareness data, and further performing service recommendation or behavior prediction.
The specific framework is shown in fig. 2: (1) the context-aware data and user interaction behavior information collected by the smartphone are processed through a context modeling module and a context inference module in sequence, user behavior patterns are extracted from the multidimensional attribute set formed by the context-aware data and the user interaction behavior information, sequence rules are generated, and the user behavior patterns are output in the form of sequence rules and stored in a sequence rule base; (2) when the user's context-aware data matches the antecedent of a sequence rule in the sequence rule base, the user's next interaction behavior is predicted. A rule has the form "context 1 ∧ context 2 ∧ … → behavior", and both its antecedent and its consequent carry time information.
To extract user behavior patterns from the multidimensional attribute set composed of context-aware data and user interaction behavior information and to generate sequence rules, "snapshots" are used here to describe the mapping from the time domain to the multidimensional attribute space.
Definition 1: (snapshot) given a mapping function $E: T \rightarrow D$ from a discrete time domain $T=\{t_1,t_2,\ldots,t_n\}$ to an $(m+1)$-dimensional attribute space $D=\{D_1,D_2,\ldots,D_{m+1}\}$, the image $E(t_i)=\{d_1,d_2,\ldots,d_{m+1}\}$ is called the snapshot at time $t_i$, $i=1,2,\ldots,n$, where $d_k\in dom(D_k)$ indicates the state of the $k$-th attribute, $k=1,2,\ldots,m+1$.
In general, $\{D_1,D_2,\ldots,D_m\}$ are different attributes of the mobile context-aware data, $D_{m+1}$ is the interaction behavior information of the user and the smartphone, and $dom(D_k)$ is the value range of the $k$-th attribute $D_k$.
Definition 2: (multidimensional item) a multidimensional item $e$ is a subset of the snapshot, i.e.
$e=\{(d_k)\mid d_k\in dom(D_k),\ 1\le k\le m+1\}$. ⑴
Definition 3: (multidimensional sequence) a multidimensional sequence $Seq_T$ is an ordered list of non-empty tuples $(e,t)$ defined over a discrete time domain $T$, denoted $Seq_T=\{(e,t)\mid t\in T\}$; a multidimensional sequence database is a set of multidimensional sequences, denoted $SeqDB=\{Seq_1,Seq_2,\ldots,Seq_N\}$, where $Seq_j$ is the $j$-th sequence in SeqDB, $j\in\{1,2,\ldots,N\}$.
For example: let $t_1,t_2\in T$ with $t_1\le t_2$. The item $e_1=\{d_1=\text{subway},\ d_2=\text{night},\ d_3=\text{raining}\}$ is a context described by the location, time, and weather attributes, and the item $e_2=\{d_4=\text{listening to music}\}$ describes the interaction behavior of the user listening to music on the mobile phone. The multidimensional sequence $s=\{(e_1,t_1),(e_2,t_2)\}$ then represents that the user listens to music on the mobile phone when taking the subway on a rainy evening.
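The example above can be written down as data. The following is a minimal sketch, assuming each multidimensional item is stored as a set of (attribute, state) pairs and each sequence element as an (item, time) tuple; the names `TimedItem` and `is_time_ordered` and the attribute labels are illustrative assumptions, not part of the patent.

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class TimedItem(NamedTuple):
    """One element (e, t) of a multidimensional sequence (hypothetical layout)."""
    item: FrozenSet[Tuple[str, str]]  # multidimensional item e: (attribute, state) pairs
    time: int                         # discrete time point t


# e1 describes the context; e2 describes the interaction behavior.
e1 = frozenset({("location", "subway"), ("time", "night"), ("weather", "raining")})
e2 = frozenset({("behavior", "listening to music")})

# s = {(e1, t1), (e2, t2)} with t1 <= t2; the time values 1 and 2 are made up.
s: List[TimedItem] = [TimedItem(e1, 1), TimedItem(e2, 2)]


def is_time_ordered(seq: List[TimedItem]) -> bool:
    """Check that the sequence is an ordered list over the time domain."""
    return all(a.time <= b.time for a, b in zip(seq, seq[1:]))
```

Using immutable sets for items keeps them hashable, so an item can later serve as a dictionary key when patterns are indexed.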
In a mobile environment, context data is collected continuously, but user interaction behavior occurs intermittently, so some time interval should be allowed between items. Two thresholds, gap and width, are therefore introduced: gap is the maximum difference between the occurrence times of any two adjacent items, and width is the maximum difference between the occurrence times of the first and last items [26]. Here, multiple items are allowed to occur at any one time point, and the same item may occur repeatedly at multiple time points.
Definition 4: (subsequence) $Seq_T=\{(e_1,t_1),(e_2,t_2),\ldots,(e_l,t_l)\}$ is a sequence on the time domain $T=\{t_1,t_2,\ldots,t_l\}$, and $Seq_{T'}=\{(e_1',t_1'),(e_2',t_2'),\ldots,(e_l',t_l')\}$ is a sequence on the time domain $T'=\{t_1',t_2',\ldots,t_l'\}$; $Seq_T$ is a subsequence of $Seq_{T'}$ if and only if: 1) (formula image BDA0002308569880000082) $e_1=e_{j_1}, e_2=e_{j_2},\ldots,e_l=e_{j_l}$; 2) $t_1'\le t_1\le\cdots\le t_l\le t_l'$.
Definition 5: (valid occurrence of a subsequence) if $Seq_T$ is a subsequence of $Seq_{T'}$ and the times of the multidimensional items of $Seq_T$ satisfy 1) $t_{i+1}-t_i\le gap$ (formula image BDA0002308569880000083), $i=1,\ldots,l-1$; and 2) $t_l-t_1\le width$, then the subsequence is called a valid occurrence of $Seq_T$ in the sequence $Seq_{T'}$, denoted $Occr(Seq_T,Seq_{T'},gap,width)$.
As shown in Table 1, $S_1=\{(a,1),(a,2),(a,3),(c,4),(b,5),(a,6),(c,7),(c,8),(c,9)\}$ is a sequence over the time domain $\{1,2,\ldots,9\}$, where a, b, c are states of subsets of the multidimensional attribute space. With the two thresholds set to gap = 2 and width = 5, the subsequence $s_1=\{(a,3),(b,5),(c,7)\}$ validly occurs once in the sequence $S_1$, i.e. $|Occr(s_1,S_1,2,5)|=1$, whereas $s_2=\{(a,1),(b,5),(c,7)\}$ and $s_3=\{(a,2),(b,5),(c,7)\}$ do not satisfy the threshold conditions and therefore do not validly occur in $S_1$: $|Occr(s_2,S_1,2,5)|=0$ and $|Occr(s_3,S_1,2,5)|=0$.
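The $S_1$ example can be checked mechanically. The sketch below is one possible reading of Definitions 4 and 5, assuming items are plain labels and that the times within one occurrence are strictly increasing; the function names are hypothetical and the brute-force enumeration is for illustration only, not the patent's implementation.

```python
from itertools import product
from typing import List, Sequence, Tuple

TimedSeq = List[Tuple[str, int]]  # list of (item label, time) pairs


def satisfies_constraints(times: Sequence[int], gap: int, width: int) -> bool:
    """Adjacent items at most `gap` apart, first and last at most `width` apart."""
    adjacent_ok = all(t2 - t1 <= gap for t1, t2 in zip(times, times[1:]))
    return adjacent_ok and times[-1] - times[0] <= width


def is_valid_occurrence(sub: TimedSeq, seq: TimedSeq, gap: int, width: int) -> bool:
    """Check one timestamped subsequence against the sequence and both thresholds."""
    times = [t for _, t in sub]
    contained = all(pair in seq for pair in sub)
    return contained and times == sorted(times) and satisfies_constraints(times, gap, width)


def count_valid_occurrences(pattern: Sequence[str], seq: TimedSeq, gap: int, width: int) -> int:
    """Count time assignments of `pattern` in `seq` that satisfy gap and width."""
    slots = [[t for item, t in seq if item == p] for p in pattern]
    count = 0
    for times in product(*slots):
        increasing = all(a < b for a, b in zip(times, times[1:]))  # assumption: strict order
        if increasing and satisfies_constraints(times, gap, width):
            count += 1
    return count


S1 = [("a", 1), ("a", 2), ("a", 3), ("c", 4), ("b", 5), ("a", 6), ("c", 7), ("c", 8), ("c", 9)]
```

With gap = 2 and width = 5, only the assignment a at 3, b at 5, c at 7 survives, which matches the count of 1 stated in the text.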
TABLE 1 sequence database
Figure BDA0002308569880000081
In some cases, a subsequence occurs frequently in the recent period even though its average frequency over the entire database is below the minimum support threshold, which may indicate that the user has tried a new service [31]. Therefore, user behavior pattern mining in a mobile context-aware environment should find not only globally frequent subsequences but also locally frequent subsequences, whose sequence patterns indicate the latest trend in user preference.
Definition 6: (frequent subsequence) a subsequence Seq is frequent in the sequence database $SeqDB=\{Seq_1,Seq_2,\ldots,Seq_N\}$ if one of the following conditions is satisfied:
$|Occr(Seq,SeqDB,gap,width)|/N\ge MinSup$, ⑵
$|Occr(Seq,SeqDB,gap,width)|/(N-start(Seq)+1)\ge MinSup$, ⑶
where MinSup is the minimum sequence support threshold set by the user.
If formula ⑵ holds, Seq is a globally frequent subsequence in the sequence database SeqDB, with sequence number range 1 to N. Otherwise formula ⑶ is checked; if it holds, Seq is a locally frequent subsequence in the sequence number range start(Seq) to N, where start(Seq) is the first sequence number in the range start(Seq) to N that contains the frequent subsequence. In the recursive search for a range start(Seq) to N that satisfies formula ⑶, each step increases start(Seq) by 1 and decreases $|Occr(Seq,SeqDB,gap,width)|$ by 1, until formula ⑶ is satisfied or $|Occr(Seq,SeqDB,gap,width)|=0$.
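The two-stage check in Definition 6 can be sketched as follows. This is an interpretation rather than the patent's code: it assumes support counts the sequences in which the subsequence validly occurs, that MinSup is positive, and that start(Seq) moves to the next sequence containing the subsequence each time one occurrence is dropped. The function name `classify_frequency` is hypothetical.

```python
from typing import List, Optional, Tuple


def classify_frequency(occ_ids: List[int], n: int, min_sup: float) -> Optional[Tuple[str, int]]:
    """Classify a subsequence as globally or locally frequent.

    occ_ids: sorted 1-based sequence numbers in which the subsequence validly occurs.
    n:       number of sequences N in the database.
    Returns ("global", 1), ("local", start), or None if neither formula holds.
    """
    count = len(occ_ids)
    if count / n >= min_sup:                    # formula (2)
        return ("global", 1)
    for start in occ_ids:                       # candidate values of start(Seq)
        if count / (n - start + 1) >= min_sup:  # formula (3)
            return ("local", start)
        count -= 1                              # drop the occurrence at `start`
    return None
```

For instance, with N = 5 and MinSup = 0.6, a subsequence occurring only in sequences 4 and 5 has global support 0.4 but local support 2/2 in the range [4, 5], so it is reported as locally frequent from sequence 4.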
As shown in Table 1, for the sequence database $S=\{S_1,S_2,S_3,S_4,S_5\}$, let MinSup = 0.6, and let a, b, c, d, e, f be subsequences of length 1 in the sequence database. Two points can be observed. First, the support of subsequence c in the sequence database satisfies the minimum support, so c is considered a frequent subsequence, yet it does not occur in the two most recent sequences $S_4$ and $S_5$ and may therefore be a service item that the user no longer uses. Second, subsequence e does not occur frequently in the sequence database as a whole, but it satisfies formula ⑶ in the sequence number range [4, 5], which to some extent indicates that the user has developed new preferences and behaviors.
Definition 7: (sequence rule) for a frequent subsequence $Seq=\{(e_1,T_1),(e_2,T_2),\ldots,(e_{l-1},T_{l-1}),(e_l,T_l)\}$, $(e_1,T_1),(e_2,T_2),\ldots,(e_{l-1},T_{l-1})$ is a sequence describing the context, composed of multidimensional items, and is also the prefix of the subsequence, denoted $prefix(Seq)$, while $(e_l,T_l)$ represents the interaction behavior of the user. The confidence of the subsequence Seq in the sequence database is $SeqConf=|Occr(Seq,SeqDB,gap,width)|/|Occr(prefix(Seq),SeqDB,gap,width)|$. ⑷
If SeqConf is not lower than the minimum confidence MinConf, a sequence rule $r:\{e_1,e_2,\ldots,e_{l-1}\}\rightarrow e_l$ may be generated. Here, rule antecedents are limited to context data of the same dimension.
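Formula ⑷ and the rule-generation condition can be illustrated with a short sketch. It assumes the two occurrence counts have already been computed elsewhere; the function name `sequence_rule`, the tuple layout of a rule, and the sample counts are illustrative assumptions.

```python
from typing import List, Optional, Tuple

Rule = Tuple[Tuple[str, ...], str, float]  # (antecedent items, behavior, confidence)


def sequence_rule(items: List[str], occ_seq: int, occ_prefix: int,
                  min_conf: float) -> Optional[Rule]:
    """Build the rule {e1, ..., e(l-1)} -> e(l) if SeqConf >= MinConf.

    items:      item labels of the frequent subsequence; the last one is the behavior.
    occ_seq:    |Occr(Seq, SeqDB, gap, width)|
    occ_prefix: |Occr(prefix(Seq), SeqDB, gap, width)|
    """
    if occ_prefix == 0:
        return None  # confidence is undefined when the prefix never occurs
    seq_conf = occ_seq / occ_prefix  # formula (4)
    if seq_conf < min_conf:
        return None
    return (tuple(items[:-1]), items[-1], seq_conf)
```

As a usage example with made-up counts, a subsequence that occurs 3 times while its prefix occurs 4 times has confidence 0.75 and yields a rule when MinConf is 0.7.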
Mining user behavior patterns in a mobile context-aware environment thus means searching the sequence database SeqDB for globally frequent and locally frequent subsequences, and then selecting from them all subsequences whose confidence is greater than or equal to the user-defined threshold MinConf to generate sequence pattern rules.
The updated multidimensional sequence pattern (UTDMSP) mining algorithm can mine not only the user's consistent behavior preferences but also new trends in user behavior. The algorithm stores the multidimensional sequence database in a nested key-value model and prunes candidates by combining the gap and width time thresholds with the Apriori property, which reduces the generation of meaningless candidate sets, avoids relying entirely on a sliding window, and effectively addresses sequence pattern mining on mobile context-aware data. The algorithm flow is as follows:
step 1: scan the sequence database once and map it into a nested key-value model for storage, wherein a key of the parent layer is a sequence described by multidimensional sensing data and the value corresponding to that key is a set of key-value pairs, in which a key of the child layer is the sequence number of a sequence and the value of the child layer is the time information at which the pattern occurs in the sequence with that sequence number;
step 2: if the part obtained by removing the first item set of sequence pattern $Seq_1$ is the same as the part obtained by removing the last item set of sequence pattern $Seq_2$, then $Seq_1$ and $Seq_2$ can be joined, i.e., the last item set of $Seq_2$ is appended to the tail of $Seq_1$ to obtain a new candidate sequence, whose sequence number list is the intersection of the sequence number lists of $Seq_1$ and $Seq_2$;
step 3: if a candidate sequence contains an infrequent subsequence, the candidate sequence cannot be frequent and is deleted from the candidate set. The support of each remaining candidate subsequence Seq under the time constraints is calculated in the sequence database $SeqDB=\{Seq_1,Seq_2,\ldots,Seq_N\}$; if the global support of the candidate in the sequence database is greater than or equal to the minimum support threshold, the candidate is added to the frequent set; otherwise, it is verified whether the support of the candidate over a recent period satisfies local frequency, and if so, the candidate is added to the frequent set;
step 4: repeat steps 2 and 3 until no more candidates are generated;
step 5: calculate the confidence of each frequent subsequence, and if the minimum confidence threshold is met and the last item of the subsequence is a user behavior, generate a rule and put it into the rule set R.
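Steps 1 and 2 above can be sketched as follows. This is a simplified illustration under stated assumptions: items are plain labels, a pattern is a tuple of labels, and the nested key-value model is a dictionary of dictionaries. The names `build_nested_kv` and `join` are hypothetical, and the time-constraint check of step 3 is omitted.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

Pattern = Tuple[str, ...]
# Parent key: pattern; child key: sequence number; child value: occurrence times.
NestedKV = Dict[Pattern, Dict[int, List[int]]]


def build_nested_kv(seq_db: Dict[int, List[Tuple[str, int]]]) -> NestedKV:
    """Step 1: one scan of the database, indexing every length-1 pattern."""
    kv: Dict[Pattern, Dict[int, List[int]]] = defaultdict(lambda: defaultdict(list))
    for seq_no, seq in seq_db.items():
        for item, t in seq:
            kv[(item,)][seq_no].append(t)
    return {pattern: dict(children) for pattern, children in kv.items()}


def join(p1: Pattern, ids1: Set[int],
         p2: Pattern, ids2: Set[int]) -> Optional[Tuple[Pattern, Set[int]]]:
    """Step 2: join p1 and p2 when p1 without its first item equals p2 without its last.

    The candidate is p1 extended by the last item of p2; its sequence number
    list is the intersection of the two input lists.
    """
    if p1[1:] != p2[:-1]:
        return None
    return p1 + (p2[-1],), ids1 & ids2
```

Intersecting the sequence number lists means a candidate's support can be bounded without rescanning the database, since the candidate can only occur in sequences that contain both of its parents.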
The user behavior sequence rules mined by the multidimensional sequence pattern mining algorithm UTDMSP are stored in the sequence rule base, where they wait for context matching by the application module to trigger behavior prediction.
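The matching step performed by the application module might look like the sketch below. The patent does not specify how competing rules are resolved or how time information is compared during matching, so this example assumes that an antecedent matches when its items appear in order within the recent context and that the highest-confidence matching rule wins; the name `predict_next_behavior` and the sample rules are hypothetical.

```python
from typing import List, Optional, Tuple

Rule = Tuple[Tuple[str, ...], str, float]  # (antecedent items, behavior, confidence)


def predict_next_behavior(recent_context: List[str], rules: List[Rule]) -> Optional[str]:
    """Return the behavior of the best rule whose antecedent matches the context."""

    def matches(antecedent: Tuple[str, ...]) -> bool:
        # Ordered-subsequence test: each antecedent item must be found
        # after the previous one in the recent context.
        remaining = iter(recent_context)
        return all(item in remaining for item in antecedent)

    candidates = [rule for rule in rules if matches(rule[0])]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: rule[2])[1]


rule_base: List[Rule] = [
    (("subway", "night"), "listen_to_music", 0.75),
    (("office",), "check_email", 0.9),
]
```

A production matcher would also apply the gap and width thresholds to the timestamps of the recent context, which this sketch leaves out.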
In summary, building on a description model for multi-source heterogeneous sensor data, the invention designs a multidimensional sequence reasoning algorithm and addresses the problems of user behavior pattern mining and pattern updating in a mobile context-aware environment. Specifically:
⑴ the multidimensional sequence pattern mining problem for the multi-source heterogeneous sensor data collected by a smartphone is formally described;
⑵ a nested key-value model is proposed to represent and store the contexts and user interaction behaviors recorded by the smartphone;
⑶ a multidimensional sequence pattern mining algorithm, UTDMSP, is constructed. It finds frequent subsequences whose global frequency in the sequence database is not lower than the minimum support, and in addition it finds subsequences whose local frequency over a recent period meets the support threshold, that is, new trends in user behavior.
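The distinction between global and local frequency can be sketched in Python as follows. This is an illustrative reading of the two support conditions, not the patented procedure: the function name `frequency_status` and the `min_window` parameter are assumptions. `min_window` is not in the source; it is added because a pattern that occurs in the very last sequence would otherwise always pass the local test with a window of length one.

```python
def frequency_status(occurrence_ids, n_sequences, min_sup, min_window=1):
    """Classify a pattern as globally frequent, locally frequent, or neither.

    occurrence_ids: sequence numbers (1..n_sequences) in which the pattern
    has a valid occurrence. Returns (status, start), where start is the
    first sequence number of the window over which the pattern is frequent.
    """
    ids = sorted(set(occurrence_ids))
    # Global test: occurrences over the whole database, range 1..N.
    if len(ids) / n_sequences >= min_sup:
        return ("global", 1)
    # Local test: try windows start..N, beginning with the earliest
    # occurrence so that the longest qualifying window is found first.
    for k, start in enumerate(ids):
        window = n_sequences - start + 1
        if window < min_window:
            break
        if (len(ids) - k) / window >= min_sup:
            return ("local", start)
    return ("infrequent", None)


# Hypothetical examples with N = 10 sequences and MinSup = 0.5.
status_global = frequency_status([1, 2, 3, 4, 5], 10, 0.5)
status_local = frequency_status([7, 8, 10], 10, 0.5)
status_none = frequency_status([2], 10, 0.5)
```

The second example fails the global test (3 of 10 sequences) but passes the local one over sequences 7 to 10 (3 of 4), which is the kind of recent trend the local condition is meant to capture.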
The invention provides an algorithm for mining user behavior patterns in a mobile context-aware environment. First, a data model based on nested key-value pairs is constructed to address the multi-source, heterogeneous nature of mobile context-aware data; it describes mobile context information and user interaction behaviors and provides knowledge support for context modeling and context reasoning. Then, a rule-based multidimensional sequence pattern mining algorithm, UTDMSP, is constructed to mine globally frequent and locally frequent user behavior patterns from the sequence database. Finally, with this data description model and algorithm, multidimensional sequence rules can be learned from mobile context-aware data and used for service recommendation or behavior prediction.
Mobile context-aware data from multiple sources can describe the user's environment accurately, but the more specific a context is, the less likely it is to recur, which restricts rule matching. For example, some sequence rules are set aside because no matching context occurs, so their validity cannot be verified. The time constraints can therefore be relaxed appropriately when the algorithm is applied in practice.
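The effect of relaxing the time constraints can be seen in a small Python sketch of an occurrence check under the gap and width thresholds. It is a simplified illustration under stated assumptions: items are plain strings, the sequence is sorted by time, and the function name `has_valid_occurrence` and the sample data are hypothetical.

```python
def has_valid_occurrence(pattern, sequence, gap, width):
    """Check whether pattern occurs in sequence under the time constraints.

    pattern:  tuple of items to be matched in order.
    sequence: list of (item, time) pairs sorted by time.
    gap:      maximum time difference between adjacent matched items.
    width:    maximum time difference between first and last matched item.
    """
    def search(k, pos, first_t, prev_t):
        if k == len(pattern):
            return True
        for i in range(pos, len(sequence)):
            item, t = sequence[i]
            # The sequence is time-sorted, so once a bound is exceeded
            # no later element can satisfy it either.
            if prev_t is not None and t - prev_t > gap:
                break
            if first_t is not None and t - first_t > width:
                break
            if item == pattern[k]:
                start = t if first_t is None else first_t
                if search(k + 1, i + 1, start, t):
                    return True
        return False

    return search(0, 0, None, None)


# Hypothetical sequence: the behavior follows the context after 7 time units.
sequence = [("home", 0), ("evening", 5), ("open_music", 12)]
pattern = ("home", "evening", "open_music")
strict = has_valid_occurrence(pattern, sequence, gap=5, width=10)
relaxed = has_valid_occurrence(pattern, sequence, gap=10, width=15)
```

With the strict thresholds the pattern is not counted, because the last step exceeds the gap; with the relaxed thresholds the same sequence contributes an occurrence, so more rules can be matched and verified.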
Personalized recommendation services for precision marketing that center on user context information fit the user-centered emphasis that is the core of precision marketing. Rapidly developing information technology supplies enterprise marketing with an increasing volume and variety of data, which greatly improves the effectiveness of personalized precision marketing. In the future, further work on big data mining and analysis methods is expected to improve the experience of context-aware services and to promote the continued development of their business models.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution. This manner of description is adopted for clarity only; those skilled in the art should take the description as a whole, and the technical solutions of the embodiments may be combined as appropriate to form other embodiments that those skilled in the art can understand.

Claims (6)

1. A method for mining user behavior patterns in a mobile context-aware environment, characterized by comprising the following steps:
(1) processing the context-aware data and user interaction behavior information collected by a smartphone through a context modeling module and a context reasoning module in sequence, extracting user behavior patterns from the multidimensional attribute set consisting of the context-aware data and the user interaction behavior information, generating sequence rules, and outputting the user behavior patterns in the form of sequence rules and storing them in a sequence rule base;
(2) when the context-aware data of the user matches the antecedent of a sequence rule in the sequence rule base, predicting the next interaction behavior of the user.
2. The method of claim 1, wherein the context modeling module is configured to represent the mobile context-aware data and the user interaction behavior information in a unified, computer-recognizable format.
3. The method of claim 2, wherein the context reasoning module is configured to infer high-level context information from low-level context information and to mine the relationship between the context-aware data and the user interaction behavior information.
4. The method of claim 3, wherein mining the relationship between the context-aware data and the user interaction behavior information means mining user behavior patterns.
5. The method for mining user behavior patterns in a mobile context-aware environment according to claim 4, wherein extracting the user behavior pattern and generating the sequence rule comprises using a snapshot to describe the mapping from the time domain to the multidimensional attribute space;
the snapshot is defined as: given a mapping function $E: T \to D$ from a discrete time domain $T = \{t_1, t_2, \dots, t_n\}$ to an $(m+1)$-dimensional attribute space $D = \{D_1, D_2, \dots, D_{m+1}\}$, the image $E(t_i) = \{d_1, d_2, \dots, d_{m+1}\}$ is called the snapshot at time $t_i$, $i = 1, 2, \dots, n$, where $d_k \in \mathrm{dom}(D_k)$ represents the state of the $k$-th attribute, $k = 1, 2, \dots, m+1$;
wherein $\{D_1, D_2, \dots, D_m\}$ are the different attributes of the mobile context-aware data, $D_{m+1}$ is the interaction behavior information between the user and the smartphone, and $\mathrm{dom}(D_k)$ is the value range of the $k$-th attribute $D_k$;
a subset of a snapshot is defined as a multidimensional item $e$, $e = \{ d_k \mid d_k \in \mathrm{dom}(D_k),\ 1 \le k \le m+1 \}$;
a multidimensional sequence $Seq_T$ is an ordered list of non-empty tuples $(e, t)$ defined over the discrete time domain $T$, denoted $Seq_T = \{(e, t) \mid t \in T\}$; a multidimensional sequence database is a set of multidimensional sequences, denoted $SeqDB = \{Seq_1, Seq_2, \dots, Seq_N\}$, where $Seq_j$ is the $j$-th sequence in $SeqDB$, $j \in \{1, 2, \dots, N\}$;
in a mobile environment, user interaction behavior occurs intermittently, so a certain time interval is allowed between items, and two thresholds, gap and width, are introduced: gap is the maximum allowed difference between the occurrence times of any two adjacent items, and width is the maximum allowed difference between the occurrence times of the first item and the last item; multiple items are allowed to occur at any time point, and the same item may occur repeatedly at multiple time points;
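thereby defining: let $Seq_T = \{(e_1, t_1), (e_2, t_2), \dots, (e_l, t_l)\}$ be a sequence on the time domain $T = \{t_1, t_2, \dots, t_l\}$ and $Seq_{T'} = \{(e_1', t_1'), (e_2', t_2'), \dots, (e_l', t_l')\}$ be a sequence on the time domain $T' = \{t_1', t_2', \dots, t_l'\}$; $Seq_T$ is a subsequence of $Seq_{T'}$ if and only if: 1) [formula images FDA0002308569870000021 and FDA0002308569870000022 not reproduced] $e_1 = e_{j_1}', e_2 = e_{j_2}', \dots, e_l = e_{j_l}'$; 2) $t_1' \le t_1 \le \dots \le t_l \le t_l'$;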
defining: if $Seq_T$ is a subsequence of $Seq_{T'}$ and the times of the multidimensional items in $Seq_T$ satisfy: 1) $t_{i+1} - t_i \le gap$, [formula images FDA0002308569870000023 and FDA0002308569870000024 not reproduced], then the subsequence $Seq_T$ is said to have a valid occurrence on the sequence $Seq_{T'}$, denoted $Occr(Seq_T, Seq_{T'}, gap, width)$;
defining a frequent subsequence: a subsequence $Seq$ is frequent in the sequence database $SeqDB = \{Seq_1, Seq_2, \dots, Seq_N\}$ if one of the following conditions is satisfied:
$\dfrac{|Occr(Seq, SeqDB, gap, width)|}{N} \ge MinSup$, ⑵
$\dfrac{|Occr(Seq, SeqDB, gap, width)|}{N - start(Seq) + 1} \ge MinSup$, ⑶
where $MinSup$ is the minimum sequence support threshold set by the user;
if formula ⑵ is satisfied, $Seq$ is a globally frequent subsequence in the sequence database $SeqDB$, with sequence-number range 1 to $N$; otherwise, whether formula ⑶ is satisfied is checked, and if so, $Seq$ is a locally frequent subsequence within the sequence-number range $start(Seq)$ to $N$, where $start(Seq)$ is the first sequence number from which $Seq$ is frequent in the range $start(Seq)$ to $N$; in the process of recursively searching for an interval $start(Seq)$ to $N$ that satisfies formula ⑶, $start(Seq)$ is increased by 1 each time and $|Occr(Seq, SeqDB, gap, width)|$ is decreased by 1 each time, until formula ⑶ is satisfied or $|Occr(Seq, SeqDB, gap, width)|$ is 0;
defining a frequent subsequence $Seq = \{(e_1, T_1), (e_2, T_2), \dots, (e_{l-1}, T_{l-1}), (e_l, T_l)\}$, wherein $(e_1, T_1), (e_2, T_2), \dots, (e_{l-1}, T_{l-1})$ is a sequence of multidimensional items describing the context and is also the prefix of the subsequence, denoted $prefix(Seq)$, and $(e_l, T_l)$ represents the interaction behavior of the user; the confidence of the subsequence $Seq$ in the sequence database is
$SeqConf = \dfrac{|Occr(Seq, SeqDB, gap, width)|}{|Occr(prefix(Seq), SeqDB, gap, width)|}$, ⑷
if $SeqConf$ is not lower than the minimum confidence $MinConf$, a sequence rule $r: \{e_1, e_2, \dots, e_{l-1}\} \to e_l$ is generated, where the rule antecedent is limited to context data of the same dimension.
6. The method for mining user behavior patterns in a mobile context-aware environment according to claim 5, wherein the context reasoning module adopts the multidimensional sequence pattern mining algorithm UTDMSP, the specific process being as follows:
step 1: scanning the sequence database once and mapping it into a nested key-value model for storage, wherein the key of the parent layer is a sequence described by multidimensional sensing data and its value is a set of child key-value pairs, the key of each child pair is the sequence number of a sequence in which the parent key occurs, and the value of the child pair is the time information of that occurrence;
step 2: if the part of sequence $Seq_1$ that remains after removing its first item set is the same as the part of sequence pattern $Seq_2$ that remains after removing its last item set, joining $Seq_1$ and $Seq_2$, i.e. appending the last item set of $Seq_2$ to the tail of $Seq_1$ to obtain a new candidate sequence whose sequence-number list is the intersection of the sequence-number lists of $Seq_1$ and $Seq_2$;
step 3: if a candidate sequence contains an infrequent subsequence, the candidate sequence cannot be frequent and is deleted from the candidate set; computing the support of each remaining candidate subsequence $Seq$ under the time constraints in the sequence database $SeqDB = \{Seq_1, Seq_2, \dots, Seq_N\}$; if the global support of the candidate in the sequence database is greater than or equal to the minimum support threshold, adding the candidate to the frequent set; otherwise, checking whether the support of the candidate over the most recent period satisfies local frequency, and if so, adding the candidate to the frequent set;
step 4: repeating steps 2 and 3 until no more candidates are generated;
step 5: computing the confidence of each frequent subsequence, and if the minimum confidence threshold is met and the last item of the subsequence is a user behavior, generating a rule and putting it into the rule set R.
CN201911249315.9A 2019-12-09 2019-12-09 User behavior pattern mining method under mobile context awareness environment Pending CN111026270A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911249315.9A CN111026270A (en) 2019-12-09 2019-12-09 User behavior pattern mining method under mobile context awareness environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911249315.9A CN111026270A (en) 2019-12-09 2019-12-09 User behavior pattern mining method under mobile context awareness environment

Publications (1)

Publication Number Publication Date
CN111026270A true CN111026270A (en) 2020-04-17

Family

ID=70205253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911249315.9A Pending CN111026270A (en) 2019-12-09 2019-12-09 User behavior pattern mining method under mobile context awareness environment

Country Status (1)

Country Link
CN (1) CN111026270A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090210373A1 (en) * 2008-02-20 2009-08-20 Matsushita Electric Industrial Co., Ltd. System architecture and process for seamless adaptation to context aware behavior models
US20120137367A1 (en) * 2009-11-06 2012-05-31 Cataphora, Inc. Continuous anomaly detection based on behavior modeling and heterogeneous information analysis
CN106651606A (en) * 2016-11-29 2017-05-10 河南科技大学 Multimedia social network user behavior pattern discovery method
CN107943946A (en) * 2017-11-24 2018-04-20 重庆科技学院 Relevance method for digging between test item bank knowledge point based on Apriori algorithm

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CAIHONG LIU, CHONGHUI GUO: "A Framework of Mobile Context-Aware Recommender System" *
TANG, H., LIAO, S.S., SUN, S.X: "A prediction framework based on contextual data to support mobile personalized marketing" *

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination