WO2007104611A3 - Input data structure for data mining - Google Patents

Input data structure for data mining

Info

Publication number
WO2007104611A3
Authority
WO
WIPO (PCT)
Prior art keywords
transactions
list
bit field
field information
different
Prior art date
Application number
PCT/EP2007/051025
Other languages
French (fr)
Other versions
WO2007104611A2 (en)
Inventor
Toni Bollinger
Ansgar Dorneich
Christoph Lingenfelder
Original Assignee
IBM
Toni Bollinger
Ansgar Dorneich
Christoph Lingenfelder
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2006-03-14
Filing date
2007-02-02
Publication date
2008-01-03
Application filed by IBM, Toni Bollinger, Ansgar Dorneich, Christoph Lingenfelder
Publication of WO2007104611A2
Publication of WO2007104611A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Abstract

A computer data structure for compressing data comprised in a set of transactions, each transaction having at least one item, contains a list of identifiers of the different items in the set of transactions, information indicating the number of identifiers in the list, and bit field information indicating the presence of the different items in the set of transactions. The bit field information is organized in accordance with the list to facilitate evaluation of patterns with respect to the set of transactions. When compressing data comprised in a plurality of transactions, a unique identifier is assigned to each different item and, if a taxonomy is defined, to each different taxonomy parent. Sets of transactions are formed and then stored using the defined computer data structures. When detecting patterns in input data, a candidate pattern is evaluated using bit map operations on the bit field information of the computer data structures.
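The structure described in the abstract can be illustrated with a minimal Python sketch. This is not the implementation claimed in the patent: the names `TransactionBlock`, `build_block` and `support` are hypothetical, and the layout (one bit field per listed item, with bit t set when transaction t contains that item) is an assumption, since the abstract only says that the bit field information is organized in accordance with the list. Taxonomy parents are left out for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence


@dataclass
class TransactionBlock:
    """Hypothetical compressed form of one set of transactions."""
    item_ids: List[int]      # identifiers of the different items in the set
    num_items: int           # number of identifiers in the list
    bitfields: List[int]     # bitfields[k]: bit t is set if transaction t contains item_ids[k]
    num_transactions: int


def build_block(transactions: Sequence[Iterable[int]]) -> TransactionBlock:
    """Compress a set of transactions (each an iterable of item identifiers)."""
    item_ids = sorted({item for t in transactions for item in t})
    position = {item: k for k, item in enumerate(item_ids)}
    bitfields = [0] * len(item_ids)
    for t_index, items in enumerate(transactions):
        for item in set(items):
            # Mark transaction t_index in the bit field of this item.
            bitfields[position[item]] |= 1 << t_index
    return TransactionBlock(item_ids, len(item_ids), bitfields, len(transactions))


def support(block: TransactionBlock, pattern: Iterable[int]) -> int:
    """Count the transactions in the block that contain every item of the pattern."""
    mask = (1 << block.num_transactions) - 1  # start with all transactions
    for item in pattern:
        if item not in block.item_ids:
            return 0  # item never occurs in this block
        mask &= block.bitfields[block.item_ids.index(item)]  # bit map AND
    return bin(mask).count("1")  # population count


block = build_block([[1, 2, 3], [1, 3], [2, 4], [1, 2, 3, 4]])
print(block.item_ids, block.num_items, support(block, [1, 3]))
```

Evaluating a candidate pattern in this sketch takes one bitwise AND per item followed by a population count, so the cost depends on the pattern length and the block size rather than on a scan of each transaction's item list.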

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP06111140 2006-03-14
EP06111140.7 2006-03-14
EP06121742 2006-10-04
EP06121742.8 2006-10-04

Publications (2)

Publication Number Publication Date
WO2007104611A2 (en) 2007-09-20
WO2007104611A3 (en) 2008-01-03

Family

ID=37903473

Family Applications (1)

Application Number Publication Priority Date Filing Date Title
PCT/EP2007/051025 WO2007104611A2 (en) 2006-03-14 2007-02-02 Input data structure for data mining

Country Status (2)

Country Link
US (1) US8250105B2 (en)
WO (1) WO2007104611A2 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102006020176A1 (en) * 2006-05-02 2007-11-08 Robert Bosch Gmbh Method for supply of forecasting pattern for object out of number of objects, involves providing forecasting pattern in consideration of usage range of these associated objects, and preset seasonal coefficients are standardized
US9128954B2 (en) * 2007-05-09 2015-09-08 Illinois Institute Of Technology Hierarchical structured data organization system
US9818118B2 (en) * 2008-11-19 2017-11-14 Visa International Service Association Transaction aggregator
US8626705B2 (en) * 2009-11-05 2014-01-07 Visa International Service Association Transaction aggregator for closed processing
US9805314B2 (en) * 2010-08-26 2017-10-31 Red Hat, Inc. Storing a business process state
JP5601724B2 (en) * 2011-11-25 2014-10-08 楽天株式会社 Information processing apparatus, information processing method, information processing program, and recording medium on which information processing program is recorded
US9171158B2 (en) 2011-12-12 2015-10-27 International Business Machines Corporation Dynamic anomaly, association and clustering detection
KR102074734B1 (en) * 2013-02-28 2020-03-02 삼성전자주식회사 Method and apparatus for pattern discovery in sequence data
US10061822B2 (en) * 2013-07-26 2018-08-28 Genesys Telecommunications Laboratories, Inc. System and method for discovering and exploring concepts and root causes of events
US9971764B2 (en) 2013-07-26 2018-05-15 Genesys Telecommunications Laboratories, Inc. System and method for discovering and exploring concepts
US20150081728A1 (en) * 2013-09-17 2015-03-19 Alon Rosenberg Automatic format conversion
CN107103471B (en) * 2017-03-28 2020-06-30 上海瑞麒维网络科技有限公司 Method and device for determining transaction validity based on block chain
US11010387B2 (en) 2017-10-06 2021-05-18 Microsoft Technology Licensing, Llc Join operation and interface for wildcards
US11520804B1 (en) 2021-05-13 2022-12-06 International Business Machines Corporation Association rule mining
US11762867B2 (en) * 2021-10-07 2023-09-19 International Business Machines Corporation Association rule mining using max pattern transactions

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001134575A (en) * 1999-10-29 2001-05-18 Internatl Business Mach Corp <Ibm> Method and system for detecting frequently appearing pattern
US6804664B1 (en) * 2000-10-10 2004-10-12 Netzero, Inc. Encoded-data database for fast queries
US7024414B2 (en) * 2001-08-06 2006-04-04 Sensage, Inc. Storage of row-column data
AU2003253196A1 (en) * 2002-07-26 2004-02-23 Ron Everett Data management architecture associating generic data items using reference
US7526461B2 (en) * 2004-11-17 2009-04-28 Gm Global Technology Operations, Inc. System and method for temporal data mining
US7630996B1 (en) * 2005-02-02 2009-12-08 Hywire Ltd. Method of database compression for database entries having a pre-determined common part
US7548928B1 (en) * 2005-08-05 2009-06-16 Google Inc. Data compression of large scale data stored in sparse tables

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Mohammed J. Zaki et al.: "Fast vertical mining using diffsets", Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003, Washington, D.C., US, pages 326-335, XP002431508 *
Pradeep Shenoy et al.: "Turbo-charging vertical mining of large databases", Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, 2000, Dallas, Texas, United States, pages 22-33, XP002431507 *
Srikant R. et al.: "Mining generalized association rules", Proceedings of the International Conference on Very Large Data Bases, 11 September 1995 (1995-09-11), pages 407-419, XP000671664 *
Yin-Fu Huang et al.: "Mining generalized association rules using pruning techniques", Data Mining, 2002. Proceedings. 2002 IEEE International Conference on, Maebashi City, Japan, 9-12 Dec. 2002, Los Alamitos, CA, USA, IEEE Comput. Soc., US, 9 December 2002 (2002-12-09), pages 227-234, XP010805120, ISBN: 0-7695-1754-4 *

Also Published As

Publication number Publication date
WO2007104611A2 (en) 2007-09-20
US8250105B2 (en) 2012-08-21
US20070220030A1 (en) 2007-09-20

Similar Documents

Publication Publication Date Title
WO2007104611A3 (en) Input data structure for data mining
Rigby et al. Localization--the revolution in consumer markets.
IL193182A0 (en) Data mining by determining patterns in input data
WO2009140347A3 (en) Systems and methods for assessing locations of multiple touch inputs
BRPI0511756A8 (en) Random access recording method and system for recording and displaying avionics data and computer readable memory
JP2009508221A5 (en)
D’Ambrosio et al. A differential evolution algorithm for finding the median ranking under the Kemeny axiomatic approach
WO2007020423A3 (en) Mutual-rank similarity-space for navigating, visualising and clustering in image databases
CA2621581A1 (en) A method, device, computer program and graphical user interface for user input of an electronic device
WO2007055821A3 (en) Defining ontologies and word disambiguation
WO2008039542A3 (en) System and method of ad-hoc analysis of data
SE0103361D0 (en) Object oriented data processing
CN104572627A (en) Object name editing distance calculating method and object name editing distance matching method based on information entropy
CN103678451A (en) Method and system for spreadsheet schema extraction
CN103618744A (en) Intrusion detection method based on fast k-nearest neighbor (KNN) algorithm
Baralis et al. I‐prune: Item selection for associative classification
WO2005003308A3 (en) Biological data set comparison method
CN101405725A (en) Information retrieval device by means of ambiguous word and program
WO2008015395A3 (en) Storage and processing of spreadsheets and other documents
CN103218452A (en) Method and device for recognizing valid interlinkage in Hub webpage
WO2009020790A3 (en) Website exchange of personal information keyed to easily remembered non-alphanumeric symbols
WO2007127096A3 (en) Data requirements methodology
WO2006072855A3 (en) Card with input elements for entering a pin code and method of entering a pin code
CN102968435A (en) Method for establishing information category system and corresponding information classification browsing and searching device
WO2010027899A3 (en) Method, computer program product, and apparatus for enabling access to enterprise information

Legal Events

Date Code Title Description
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101)
NENP Non-entry into the national phase (Ref country code: DE)
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 07726302; Country of ref document: EP; Kind code of ref document: A2)
122 EP: PCT application non-entry in European phase (Ref document number: 07726302; Country of ref document: EP; Kind code of ref document: A2)