CN113505156A - Transaction data frequent sequence pattern mining method based on improved PrefixSpan algorithm - Google Patents

Info

Publication number
CN113505156A
Authority
CN
China
Prior art keywords
prefix
transaction data
commodity transaction
database
transaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110777271.8A
Other languages
Chinese (zh)
Inventor
何新
王子龙
陈琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Rongxin Intelligent Technology Co ltd
Original Assignee
Nanjing Rongxin Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Rongxin Intelligent Technology Co ltd
Priority to CN202110777271.8A
Publication of CN113505156A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2425 Iterative querying; Query formulation based on the results of a preceding query
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24534 Query rewriting; Transformation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Finance (AREA)
  • Software Systems (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Fuzzy Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for mining frequent sequential patterns in transaction data based on an improved PrefixSpan algorithm, comprising the following steps: preprocessing commodity transaction data to obtain a commodity transaction data set, and storing the data set in a transaction sequence database; scanning the transaction sequence database, counting each single item to obtain its sequence support, sorting the items in descending order of support, and selecting the top μ single items that also satisfy the minimum support as initial prefixes; using depth-first traversal, calculating the position of the first initial prefix, storing it in a prefix position information table, and generating a commodity transaction projection database; iterating over the commodity transaction projection database until no new projection database can be generated, and storing the set of frequent sequential patterns generated from each projection database; and repeating the previous step from the second initial prefix onward until all initial prefixes have been processed. The invention reduces the time and space consumed by frequent sequential pattern mining of transaction data and improves execution efficiency.

Description

Transaction data frequent sequence pattern mining method based on improved PrefixSpan algorithm
Technical Field
The invention relates to the technical field of transaction data mining, and in particular to a method for mining frequent sequential patterns in transaction data based on an improved PrefixSpan algorithm.
Background
The transaction data of a large supermarket chain is held in a series of user transaction databases. Each record contains the user's ID, the time of the transaction and the items involved. If patterns describing the association between transactions can be mined, that is, the links between a user's successive purchases, more targeted marketing measures can be taken.
Researchers have devoted considerable effort to frequent sequential pattern mining of transaction data and have proposed several typical algorithms, such as GSP, SPADE and PrefixSpan. The GSP algorithm reduces the number of candidate sequences to be scanned and the generation of redundant patterns, but on a large sequence database it still generates many candidate sequential patterns and has to scan the database repeatedly. The SPADE algorithm reduces the number of database scans to three, but generates a large number of vertical databases when the original data is huge. The PrefixSpan algorithm generates no candidate sequences and, compared with the other two algorithms, has relatively stable memory consumption and higher efficiency; however, it may produce repeated projection databases, which are then mined and partitioned again, causing redundant computation and increasing time and space consumption. A method for mining frequent sequential patterns in transaction data based on an improved PrefixSpan algorithm is therefore needed.
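For orientation, the unmodified PrefixSpan recursion referred to above can be sketched as follows. This is a minimal illustration rather than the implementation used in the patent: it assumes each sequence is a list of single items (no itemsets), uses an absolute support count `min_count`, and the function name `prefixspan` and the toy database are hypothetical.

```python
from collections import Counter

def prefixspan(db, min_count):
    """Return {pattern tuple: support} for a database of single-item sequences."""
    results = {}

    def mine(prefix, projected):
        # projected: list of (sequence index, offset where the suffix starts)
        counts = Counter()
        for sid, start in projected:
            counts.update(set(db[sid][start:]))  # one count per sequence
        for item, support in counts.items():
            if support < min_count:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            # Project on the first occurrence of item in each suffix.
            new_projected = []
            for sid, start in projected:
                seq = db[sid]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        new_projected.append((sid, pos + 1))
                        break
            mine(pattern, new_projected)

    mine((), [(sid, 0) for sid in range(len(db))])
    return results

# Hypothetical toy database: four customers, items "a", "b", "c".
db = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
patterns = prefixspan(db, min_count=2)
```

Note that no candidate sequences are generated; every pattern is grown from a projection. The sketch re-mines every projection it reaches, even when two different prefixes lead to the same projection, which is the redundancy the improved method targets.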
Disclosure of Invention
The invention aims to provide a method for mining frequent sequential patterns in transaction data based on an improved PrefixSpan algorithm, which reduces the time and space consumed by frequent sequential pattern mining of transaction data and improves execution efficiency.
To achieve this purpose, the invention provides the following scheme:
A method for mining frequent sequential patterns in transaction data based on an improved PrefixSpan algorithm comprises the following steps:
S1) preprocessing the acquired commodity transaction data to obtain a commodity transaction data set, and storing the commodity transaction data set in a transaction sequence database D;
S2) scanning the transaction sequence database D, counting each single item of length 1 to obtain the sequence support sup of each single item, sorting the items in descending order of support, and selecting the top μ single items that also satisfy the minimum support min_sup as initial prefixes;
S3) using depth-first traversal, calculating the position of the first initial prefix, storing it in the prefix position information table, and generating a commodity transaction projection database; iterating over the commodity transaction projection database until no new commodity transaction projection database can be generated, and storing the frequent sequential pattern set generated from each commodity transaction projection database;
S4) repeating step S3) from the second initial prefix onward until all initial prefixes have been processed;
wherein step S4) specifically includes:
S401) generating the commodity transaction projection database of the second initial prefix; if the commodity transaction projection database is empty, returning from the recursion;
S402) scanning the commodity transaction projection database and counting the single items; if the sequence support sup of every single item is lower than the minimum support min_sup, returning from the recursion;
S403) merging each single item that satisfies the minimum support min_sup with the current prefix to obtain a plurality of new prefixes, and calculating the prefix positions of the new prefixes; if the prefix position information table already contains a prefix whose position is the same as that of the new prefix, directly returning the frequent sequential pattern set generated by that prefix in the prefix position information table and returning to step S3); otherwise, storing the new prefix position information in the prefix position information table, generating a new commodity transaction projection database, and returning to step S401).
Optionally, the preprocessing of the acquired commodity transaction data in step S1) includes completing or deleting missing or repeated order records, and correcting records that contain errors.
Optionally, μ in step S2) is the effective item number, which indicates the number of categories of main commodities sold by the vending machine.
Optionally, the effective item number μ may be set according to the mechanical structure, container capacity and number of goods channels of the vending machine, or according to the number of main commodity types set by the administrator, or according to a combination of the two.
Optionally, the prefix position information table in step S3) is stored as a Hash table.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects: by using the prefix position information table together with depth-first traversal, the method avoids recomputing repeated projection databases, reduces the time and space consumed by frequent sequential pattern mining of transaction data, and improves execution efficiency.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of the transaction data sequence pattern mining method based on the improved PrefixSpan algorithm according to an embodiment of the present invention;
FIG. 2 is a flow chart of the PrefixSpan algorithm according to the embodiment of the present invention;
FIG. 3 is a partial view of the commodity transaction data according to an embodiment of the invention;
FIG. 4 is a partial view of the commodity transaction data set according to an embodiment of the invention;
FIG. 5 is a comparison chart of the execution efficiency before and after the improvement of the PrefixSpan algorithm according to the embodiment of the present invention;
FIG. 6 is a comparison chart of the execution space before and after the improvement of the PrefixSpan algorithm according to the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art from these embodiments without creative effort fall within the protection scope of the present invention.
The invention aims to provide a method for mining frequent sequential patterns in transaction data based on an improved PrefixSpan algorithm, which reduces the time and space consumed by frequent sequential pattern mining of transaction data and improves execution efficiency.
To make the above objects, features and advantages of the present invention easier to understand, the invention is described in further detail below with reference to the drawings and specific embodiments.
As shown in FIG. 1 and FIG. 2, the method for mining frequent sequential patterns in transaction data based on the improved PrefixSpan algorithm provided by the embodiment of the present invention includes the following steps:
S1) preprocessing the acquired commodity transaction data to obtain a commodity transaction data set, and storing the commodity transaction data set in a transaction sequence database D;
S2) scanning the transaction sequence database D, counting each single item of length 1 to obtain the sequence support sup of each single item, sorting the items in descending order of support, and selecting the top μ single items that also satisfy the minimum support min_sup as initial prefixes; μ is the effective item number and indicates the number of categories of main commodities sold by the vending machine; the effective item number μ is set according to the mechanical structure, container capacity and number of goods channels of the vending machine, or according to the number of main commodity types set by the administrator, or according to a combination of the two;
S3) using depth-first traversal, calculating the position of the first initial prefix, storing it in the prefix position information table, and generating a commodity transaction projection database; iterating over the commodity transaction projection database until no new commodity transaction projection database can be generated, and storing the frequent sequential pattern set generated from each commodity transaction projection database; the prefix position information table is stored as a Hash table;
S4) repeating step S3) from the second initial prefix onward until all initial prefixes have been processed;
wherein step S4) specifically includes:
S401) generating the commodity transaction projection database of the second initial prefix; if the commodity transaction projection database is empty, returning from the recursion;
S402) scanning the commodity transaction projection database and counting the single items; if the sequence support sup of every single item is lower than the minimum support min_sup, returning from the recursion;
S403) merging each single item that satisfies the minimum support min_sup with the current prefix to obtain a plurality of new prefixes, and calculating the prefix positions of the new prefixes; if the prefix position information table already contains a prefix whose position is the same as that of the new prefix, directly returning the frequent sequential pattern set generated by that prefix in the prefix position information table and returning to step S3); otherwise, storing the new prefix position information in the prefix position information table, generating a new commodity transaction projection database, and returning to step S401).
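The notions of prefix position and projection database used in steps S3) and S401) to S403) can be illustrated with a short sketch. The patent does not specify how a prefix position is represented; the sketch assumes it is the set of (sequence index, offset just past the earliest match of the prefix) pairs, and again assumes sequences of single items. The names `prefix_positions` and `project` and the toy database are hypothetical.

```python
def prefix_positions(db, prefix):
    """For each sequence containing prefix, record the offset just past its earliest match."""
    positions = []
    for sid, seq in enumerate(db):
        pos = 0
        for item in prefix:
            try:
                pos = seq.index(item, pos) + 1  # first occurrence at or after pos
            except ValueError:
                pos = None                      # prefix does not occur in this sequence
                break
        if pos is not None:
            positions.append((sid, pos))
    return tuple(positions)

def project(db, positions):
    """Projection database: the non-empty suffixes that follow the prefix."""
    return [db[sid][pos:] for sid, pos in positions if pos < len(db[sid])]

# Hypothetical toy database.
db = [
    ["a", "b", "c", "d"],
    ["b", "a", "c", "d"],
    ["a", "b", "c"],
]
pos_ac = prefix_positions(db, ("a", "c"))
pos_bc = prefix_positions(db, ("b", "c"))
```

In this toy database the prefixes ("a", "c") and ("b", "c") end at the same offsets in the same sequences, so `pos_ac == pos_bc` and both project to `[["d"], ["d"]]`. This is the situation in which step S403) reuses an earlier result instead of mining the projection again.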
Experiments are described below to verify the properties of the improved PrefixSpan algorithm of the embodiment of the present invention (PPFPrefixSpan) and to compare it with the PrefixSpan algorithm with effective items added (PrefixSpan Based on Effective Items, EIPrefixSpan) and with the unmodified PrefixSpan algorithm.
As shown in FIG. 3, the test commodity transaction data for this experiment consists of all sales records of a retail supermarket from January 2015 to April 30, 2015. The data set covers 42817 records of 2000 users purchasing goods over the four months and is organized into 17 columns, including 'customer number', 'major class code', 'major class name', 'middle class code', 'minor class name', 'sales date', 'sales month', 'goods code', 'specification model', 'goods type', 'unit', 'sales quantity', 'sales amount', 'unit price' and 'promotion or not'. The data set classifies the goods into 4 goods types, 15 major classes, 176 middle classes and 759 minor classes. In line with the practical situation of the intelligent vending machine system, the 176 commodity middle classes are selected as the feature item set, and the experimental data are processed with each of the three algorithms: PrefixSpan, EIPrefixSpan and PPFPrefixSpan. All experimental programs are written in Python, and the software runs on Windows.
First, the acquired commodity transaction data is preprocessed, which includes completing or deleting missing or repeated order records and correcting records that contain errors. Then, of the 17 preprocessed columns, the 5 columns 'customer number', 'middle class code', 'middle class name', 'sales date' and 'sales month' are retained, as shown in FIG. 4.
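A possible shape for this preprocessing step is sketched below. It is an assumption-laden illustration: the record keys (`customer`, `sale_date`, `class_code`), the choice to delete incomplete records rather than complete them, the rule that a repeated (customer, date, class code) triple is a duplicate, the ISO date format and the sample codes are all hypothetical and are not taken from the patent.

```python
from collections import defaultdict

def build_sequence_db(records):
    """Group cleaned records into one date-ordered sequence of class codes per customer."""
    seen = set()
    per_customer = defaultdict(list)
    for rec in records:
        key = (rec.get("customer"), rec.get("sale_date"), rec.get("class_code"))
        if None in key or "" in key:
            continue  # delete records with missing fields
        if key in seen:
            continue  # delete repeated order records
        seen.add(key)
        per_customer[key[0]].append((key[1], key[2]))
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return {cust: [code for _, code in sorted(rows)]
            for cust, rows in per_customer.items()}

# Hypothetical sample records.
records = [
    {"customer": "C1", "sale_date": "2015-01-03", "class_code": "1505"},
    {"customer": "C1", "sale_date": "2015-01-01", "class_code": "2011"},
    {"customer": "C1", "sale_date": "2015-01-03", "class_code": "1505"},  # repeated record
    {"customer": "C2", "sale_date": "2015-02-10", "class_code": None},    # missing value
    {"customer": "C2", "sale_date": "2015-02-11", "class_code": "2011"},
]
sequence_db = build_sequence_db(records)
```

The resulting dictionary maps each customer to the ordered list of middle-class codes purchased, which is the form the mining sketches in this description take as input (using `list(sequence_db.values())`).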
On the commodity transaction data set, the execution efficiency and execution space of the EIPrefixSpan and PPFPrefixSpan algorithms were measured with the support sup set to 2%, 4%, 6%, 8%, 10%, 12%, 14% and 16%, and with μ = 30. The results of the execution efficiency comparison experiment are shown in Table 1, and the comparison chart is shown in FIG. 5.
Table 1. Results of the execution efficiency comparison experiment
[Table 1 appears as an image (BDA0003156069430000051) in the original publication; its values are not reproduced here.]
As FIG. 5 shows, the PPFPrefixSpan algorithm executes faster than the EIPrefixSpan algorithm and significantly faster than the PrefixSpan algorithm. During sequential pattern mining, the prefix position information table and depth-first traversal let PPFPrefixSpan avoid regenerating frequent sequences by recursing repeatedly into duplicate projection databases, which reduces running time. The more duplicate projection databases a data set produces, the more pronounced the effect. The mining efficiency of sequential patterns is thereby improved.
The results of the execution space comparison experiment are shown in Table 2, and the comparison chart is shown in FIG. 6.
Table 2. Results of the execution space comparison experiment
[Table 2 appears as an image (BDA0003156069430000061) in the original publication; its values are not reproduced here.]
As FIG. 6 shows, the PPFPrefixSpan algorithm uses less execution space than the EIPrefixSpan algorithm and significantly less than the PrefixSpan algorithm. Because PPFPrefixSpan avoids generating repeated projection databases during sequential pattern mining, it needs less memory while running.
The experimental results show that, when a data set produces many repeated projection databases during execution, the PPFPrefixSpan algorithm saves time and space compared with the EIPrefixSpan algorithm, which demonstrates that optimizing the algorithm by introducing the prefix position information table is feasible.
The EIPrefixSpan algorithm builds on the PrefixSpan algorithm by introducing the effective item number μ in the first step. The underlying idea is that, when a vending machine is stocked in practice, its limited capacity means that only the few best-selling commodities, together with the associated commodities of highest support, can actually be selected. Retail supermarket data, however, often contains tens to hundreds of commodity types, so only the best-selling commodities and their associated commodities with high confidence need to be obtained.
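The first-step selection of effective items can be sketched as follows, assuming sequence support is the fraction of sequences that contain an item and that ties are broken alphabetically (the tie-break rule is an assumption; the patent does not state one). The function name `initial_prefixes` and the sample data are hypothetical.

```python
from collections import Counter

def initial_prefixes(db, mu, min_sup):
    """Return up to mu single items with relative support >= min_sup, most frequent first."""
    counts = Counter()
    for seq in db:
        counts.update(set(seq))  # sequence support: one count per sequence
    n = len(db)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, count in ranked[:mu] if count / n >= min_sup]

# Hypothetical sample: four purchase sequences.
db = [
    ["milk", "bread"],
    ["milk", "tea"],
    ["milk", "bread", "tea"],
    ["water"],
]
top2 = initial_prefixes(db, mu=2, min_sup=0.5)
top4 = initial_prefixes(db, mu=4, min_sup=0.5)
```

With `mu=2` only the two best-selling items become initial prefixes; with `mu=4` the item "water" is still excluded because its support (0.25) is below `min_sup`. Both conditions of step S2) therefore apply together.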
The PPFPrefixSpan algorithm is a prefix-projection-based sequential pattern mining algorithm built on the EIPrefixSpan algorithm. It differs from EIPrefixSpan in that it introduces a prefix position information table, exploiting the fact that identical projection databases generate identical sets of frequent sequential patterns. When the position information of a prefix α is the same as that of a prefix β already in the prefix position information table, the projection database of α would generate the same frequent sequential pattern set as that of β, so the set generated for β can be returned directly. To build the table, a single item is taken as the initial prefix and recursion proceeds from length 1 to length L, producing an initial reference set. Depth-first traversal is used because prefix positions repeat far less often among new prefixes generated recursively from the same prefix than among prefixes of the same length visited by breadth-first traversal.
In practice, the prefix position information table is stored as a Hash table, a data structure that stores and accesses data as Key-Value pairs and can be queried directly by Key, which speeds up lookups. The algorithm involves three important pieces of information: prefixes, prefix positions, and the frequent sequences of the projection database corresponding to each prefix. This information is therefore stored in a two-level Hash table. The first level maps prefix to prefix position, with the prefix as Key and the prefix position as Value; the second level maps prefix position to frequent sequences, with the prefix position as Key and the frequent sequences of the corresponding projection database as Value.
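One way to realize such a two-level table in Python is with two dictionaries, since `dict` is a hash table. The sketch below is illustrative only: the class name `PrefixPositionTable`, its methods, and the representation of a prefix position as a tuple of (sequence index, offset) pairs are assumptions, not details given in the patent.

```python
class PrefixPositionTable:
    """Two-level lookup: prefix -> prefix position -> frequent sequences."""

    def __init__(self):
        self.position_of = {}  # level 1: prefix (Key) -> prefix position (Value)
        self.patterns_at = {}  # level 2: prefix position (Key) -> frequent sequences (Value)

    def lookup(self, position):
        """Return the frequent sequences stored for a position, or None if it is new."""
        return self.patterns_at.get(position)

    def store(self, prefix, position, patterns):
        """Record a prefix, its position, and the frequent sequences mined from it."""
        self.position_of[prefix] = position
        self.patterns_at[position] = patterns

    def patterns_for(self, prefix):
        """Follow both levels: prefix -> position -> frequent sequences."""
        return self.patterns_at[self.position_of[prefix]]

# Hypothetical usage: prefix ("a", "c") is mined first and stored.
table = PrefixPositionTable()
position = ((0, 3), (1, 3))
table.store(("a", "c"), position, {("d",): 2})

# Later, prefix ("b", "c") turns out to have the same position: reuse the result.
cached = table.lookup(position)
if cached is not None:
    table.store(("b", "c"), position, cached)
```

Both levels use hashable keys (tuples), so each lookup is a constant-time dictionary access on average, which is the speed-up the text attributes to the Hash table.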
The algorithm first scans the whole sequence database, counts all single items, sorts them by support, and takes the top μ items with the highest support as initial prefixes. Using depth-first traversal, it first constructs the projection database of the first initial prefix; if the prefix position information table is empty, the prefix position information is stored directly. Single items are then counted in the projection database, each single item meeting the minimum support is combined with the original prefix to form a new prefix, the prefix position information of the new prefix is stored, and a new projection database is formed. The new projection database is iterated in the same way until the projection database is empty, and all sequential pattern results are stored. Next, the prefix position information of the second initial prefix is obtained and the prefix position information table is scanned. If a prefix with the same position as this initial prefix exists, recursion stops and the sequential pattern set generated by that prefix is returned directly; otherwise, the position information of the initial prefix is stored and recursion continues. This is repeated until all initial prefixes have been processed, and all sequential pattern sets are returned.
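The overall flow can be sketched in a few dozen lines. This is one possible reading of the description, not the patented implementation: it assumes single-item sequences, an absolute support count `min_count`, a prefix position represented as a tuple of (sequence index, offset) pairs with exhausted sequences dropped, and a table that stores the suffix patterns mined from each position so they can be reattached to any prefix that reaches the same position. The function name `ppf_prefixspan` and the reuse counter are hypothetical.

```python
from collections import Counter

def ppf_prefixspan(db, mu, min_count):
    """Return ({pattern: support}, number of projections reused from the table)."""
    memo = {}   # prefix position -> {suffix pattern: support}
    reused = 0

    def advance(position, item):
        # Offsets just past the first item in each suffix; exhausted sequences are dropped.
        nxt = []
        for sid, start in position:
            seq = db[sid]
            if item in seq[start:]:
                end = seq.index(item, start) + 1
                if end < len(seq):
                    nxt.append((sid, end))
        return tuple(nxt)

    def item_counts(position):
        counts = Counter()
        for sid, start in position:
            counts.update(set(db[sid][start:]))  # one count per sequence
        return counts

    def mine(position):
        nonlocal reused
        if not position:
            return {}                 # empty projection database
        if position in memo:
            reused += 1               # same position already mined: reuse its result
            return memo[position]
        found = {}
        for item, support in item_counts(position).items():
            if support >= min_count:
                found[(item,)] = support
                for suffix, sup in mine(advance(position, item)).items():
                    found[(item,) + suffix] = sup
        memo[position] = found
        return found

    everything = tuple((sid, 0) for sid, seq in enumerate(db) if seq)
    ranked = sorted(item_counts(everything).items(), key=lambda kv: (-kv[1], kv[0]))
    result = {}
    for item, support in ranked[:mu]:  # top mu single items as initial prefixes
        if support >= min_count:
            result[(item,)] = support
            for suffix, sup in mine(advance(everything, item)).items():
                result[(item,) + suffix] = sup
    return result, reused

# Hypothetical toy database.
db = [["a", "b", "c", "d"], ["b", "a", "c", "d"], ["a", "b", "c"]]
patterns, reused = ppf_prefixspan(db, mu=4, min_count=2)
```

On this toy database the prefixes ("a", "c"), ("b", "c") and ("c",) all reach the same position, so the projection is mined once and reused twice. Whether the saving is significant on real data depends on how often positions coincide, as the experiments above note.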
By using the prefix position information table together with depth-first traversal, the method for mining frequent sequential patterns in transaction data based on the improved PrefixSpan algorithm provided by the invention avoids recomputing repeated projection databases, reduces the time and space consumed by frequent sequential pattern mining of transaction data, and improves execution efficiency.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help in understanding the method and the core concept of the present invention. A person skilled in the art may, following the idea of the present invention, vary the specific embodiments and the scope of application. In view of the above, the content of this specification should not be construed as limiting the invention.

Claims (5)

1. A method for mining frequent sequential patterns in transaction data based on an improved PrefixSpan algorithm, characterized by comprising the following steps:
S1) preprocessing the acquired commodity transaction data to obtain a commodity transaction data set, and storing the commodity transaction data set in a transaction sequence database D;
S2) scanning the transaction sequence database D, counting each single item of length 1 to obtain the sequence support sup of each single item, sorting the items in descending order of support, and selecting the top μ single items that also satisfy the minimum support min_sup as initial prefixes;
S3) using depth-first traversal, calculating the position of the first initial prefix, storing it in the prefix position information table, and generating a commodity transaction projection database; iterating over the commodity transaction projection database until no new commodity transaction projection database can be generated, and storing the frequent sequential pattern set generated from each commodity transaction projection database;
S4) repeating step S3) from the second initial prefix onward until all initial prefixes have been processed;
wherein step S4) specifically includes:
S401) generating the commodity transaction projection database of the second initial prefix; if the commodity transaction projection database is empty, returning from the recursion;
S402) scanning the commodity transaction projection database and counting the single items; if the sequence support sup of every single item is lower than the minimum support min_sup, returning from the recursion;
S403) merging each single item that satisfies the minimum support min_sup with the current prefix to obtain a plurality of new prefixes, and calculating the prefix positions of the new prefixes; if the prefix position information table already contains a prefix whose position is the same as that of the new prefix, directly returning the frequent sequential pattern set generated by that prefix in the prefix position information table and returning to step S3); otherwise, storing the new prefix position information in the prefix position information table, generating a new commodity transaction projection database, and returning to step S401).
2. The method for mining frequent sequential patterns in transaction data based on the improved PrefixSpan algorithm as claimed in claim 1, wherein the preprocessing of the acquired commodity transaction data in step S1) includes completing or deleting missing or repeated order records and correcting records that contain errors.
3. The method for mining frequent sequential patterns in transaction data based on the improved PrefixSpan algorithm as claimed in claim 1, wherein μ in step S2) is the effective item number, which indicates the number of categories of main commodities sold by the vending machine.
4. The method for mining frequent sequential patterns in transaction data based on the improved PrefixSpan algorithm as claimed in claim 3, wherein the effective item number μ is set according to the mechanical structure, container capacity and number of goods channels of the vending machine, or according to the number of main commodity types set by the administrator, or according to a combination of the two.
5. The method for mining frequent sequential patterns in transaction data based on the improved PrefixSpan algorithm as claimed in claim 1, wherein the prefix position information table in step S3) is stored as a Hash table.
CN202110777271.8A 2021-07-09 2021-07-09 Transaction data frequent sequence pattern mining method based on improved Prefix span algorithm Withdrawn CN113505156A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110777271.8A CN113505156A (en) 2021-07-09 2021-07-09 Transaction data frequent sequence pattern mining method based on improved Prefix span algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110777271.8A CN113505156A (en) 2021-07-09 2021-07-09 Transaction data frequent sequence pattern mining method based on improved Prefix span algorithm

Publications (1)

Publication Number Publication Date
CN113505156A true CN113505156A (en) 2021-10-15

Family

ID=78011958

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110777271.8A Withdrawn CN113505156A (en) 2021-07-09 2021-07-09 Transaction data frequent sequence pattern mining method based on improved Prefix span algorithm

Country Status (1)

Country Link
CN (1) CN113505156A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115878421A (en) * 2022-12-09 2023-03-31 国网湖北省电力有限公司信息通信公司 Data center equipment-level fault prediction method, system and medium based on log time sequence correlation characteristic mining
CN115878421B (en) * 2022-12-09 2023-11-14 国网湖北省电力有限公司信息通信公司 Data center equipment level fault prediction method, system and medium

Similar Documents

Publication Publication Date Title
Hartman The parallel replacement problem with demand and capital budgeting constraints
US5819266A (en) System and method for mining sequential patterns in a large database
US6154739A (en) Method for discovering groups of objects having a selectable property from a population of objects
US7181460B2 (en) User-defined aggregate functions in database systems without native support
US9639596B2 (en) Processing data in a data warehouse
US20130124960A1 (en) Automated suggested summarizations of data
AU7196900A (en) Method and system for researching sales effects of advertising using association analysis
Packianather et al. Data mining techniques applied to a manufacturing SME
CN111460011A (en) Page data display method and device, server and storage medium
CN110807053A (en) Method for finding frequent item set based on improved Apriori algorithm
CN113505156A (en) Transaction data frequent sequence pattern mining method based on improved Prefix span algorithm
Verma et al. Data mining: next generation challenges and future directions
Yang et al. Discovery of online shopping patterns across websites
Hilderman et al. Mining association rules from market basket data using share measures and characterized itemsets
US6947878B2 (en) Analysis of retail transactions using gaussian mixture models in a data mining system
Mondal et al. An inventory-aware and revenue-based itemset placement framework for retail stores
Wu et al. Modeling and imputation of large incomplete multidimensional datasets
CN107609110B (en) Mining method and device for maximum multiple frequent patterns based on classification tree
Kain et al. The index selection problem with configurations and memory limitation: A scatter search approach
Tan et al. Finding similar time series in sales transaction data
CN114266914A (en) Abnormal behavior detection method and device
CN114091842A (en) Commodity data quality evaluation method, commodity data replenishment method, commodity data quality evaluation apparatus, and storage medium
CN114328491A (en) Data processing method and device
CN112801793B (en) Method for mining high-profit commodities in e-commerce transaction data
Mukai et al. CONTRIBUTION OF INFORMATION SYSTEMS TO BUSINESS PERFORMANCE AS AN EMBEDDED FACTOR OF" DIFFERENTIATION MECHANISM": A CASE STUDY OF SEVEN-ELEVEN JAPAN (APCIM2009 Best Papers)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20211015