HK1093568A1 - Data profiling

Data profiling

Info

Publication number
HK1093568A1
Authority
HK
Hong Kong
Prior art keywords
data
subsets
fields
field
identifying
Prior art date
Application number
HK06114200.1A
Other languages
English (en)
Inventor
Joel Gould
Carl Feynman
Paul Bay
Original Assignee
Ab Initio Technology Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2003-09-15
Filing date
2006-12-28
Publication date
2007-03-02
Application filed by Ab Initio Technology Llc
Publication of HK1093568A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24534 Query rewriting; Transformation
    • G06F16/24542 Plan optimisation
    • G06F16/24544 Join order optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations
    • G06F16/24556 Aggregation; Duplicate elimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/40 Data acquisition and logging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Operations Research (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Numerical Control (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)
  • Television Systems (AREA)
  • Holography (AREA)
  • Crystals, And After-Treatments Of Crystals (AREA)
  • Optical Communication System (AREA)
HK06114200.1A 2003-09-15 2006-12-28 Data profiling HK1093568A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US50290803P 2003-09-15 2003-09-15
US51303803P 2003-10-20 2003-10-20
US53295603P 2003-12-22 2003-12-22
PCT/US2004/030144 WO2005029369A2 (en) 2003-09-15 2004-09-15 Data profiling

Publications (1)

Publication Number Publication Date
HK1093568A1 (en) 2007-03-02

Family

ID=34381971

Family Applications (1)

Application Number Title Priority Date Filing Date
HK06114200.1A HK1093568A1 (en) 2003-09-15 2006-12-28 Data profiling

Country Status (10)

Country Link
US (5) US7756873B2
EP (3) EP1676217B1
JP (3) JP5328099B2
KR (4) KR20090039803A
CN (1) CN102982065B
AT (1) ATE515746T1
AU (3) AU2004275334B9
CA (3) CA2538568C
HK (1) HK1093568A1
WO (1) WO2005029369A2

Families Citing this family (208)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2004275334B9 (en) * 2003-09-15 2011-06-16 Ab Initio Technology Llc. Data Profiling
US7653641B2 (en) * 2004-05-04 2010-01-26 Accruent, Inc. Abstraction control solution
US7349898B2 (en) * 2004-06-04 2008-03-25 Oracle International Corporation Approximate and exact unary inclusion dependency discovery
US7647293B2 (en) * 2004-06-10 2010-01-12 International Business Machines Corporation Detecting correlation from data
US7386566B2 (en) * 2004-07-15 2008-06-10 Microsoft Corporation External metadata processing
US8732004B1 (en) 2004-09-22 2014-05-20 Experian Information Solutions, Inc. Automated analysis of data to generate prospect notifications based on trigger events
US7852342B2 (en) 2004-10-14 2010-12-14 Microsoft Corporation Remote client graphics rendering
US20060082581A1 (en) 2004-10-14 2006-04-20 Microsoft Corporation Encoding for remoting graphics to decoder device
US7610264B2 (en) * 2005-02-28 2009-10-27 International Business Machines Corporation Method and system for providing a learning optimizer for federated database systems
CN102004950A (zh) * 2005-04-25 2011-04-06 Invensys Systems, Inc. Recording and tracking non-trending production data and events in an industrial process control environment
US7836104B2 (en) * 2005-06-03 2010-11-16 Sap Ag Demonstration tool for a business information enterprise system
US7877350B2 (en) * 2005-06-27 2011-01-25 Ab Initio Technology Llc Managing metadata for graph-based computations
US20070006070A1 (en) * 2005-06-30 2007-01-04 International Business Machines Corporation Joining units of work based on complexity metrics
US8788464B1 (en) * 2005-07-25 2014-07-22 Lockheed Martin Corporation Fast ingest, archive and retrieval systems, method and computer programs
US20070033198A1 (en) * 2005-08-02 2007-02-08 Defries Anthony Data representation architecture for media access
US8527563B2 (en) 2005-09-12 2013-09-03 Microsoft Corporation Remoting redirection layer for graphics device interface
US20070073721A1 (en) * 2005-09-23 2007-03-29 Business Objects, S.A. Apparatus and method for serviced data profiling operations
US20070074176A1 (en) * 2005-09-23 2007-03-29 Business Objects, S.A. Apparatus and method for parallel processing of data profiling information
US8996586B2 (en) * 2006-02-16 2015-03-31 Callplex, Inc. Virtual storage of portable media files
US7873628B2 (en) * 2006-03-23 2011-01-18 Oracle International Corporation Discovering functional dependencies by sampling relations
US20070271259A1 (en) * 2006-05-17 2007-11-22 It Interactive Services Inc. System and method for geographically focused crawling
US7526486B2 (en) * 2006-05-22 2009-04-28 Initiate Systems, Inc. Method and system for indexing information about entities with respect to hierarchies
AU2007254820B2 (en) 2006-06-02 2012-04-05 International Business Machines Corporation Automatic weight generation for probabilistic matching
US7711736B2 (en) * 2006-06-21 2010-05-04 Microsoft International Holdings B.V. Detection of attributes in unstructured data
US7698268B1 (en) 2006-09-15 2010-04-13 Initiate Systems, Inc. Method and system for filtering false positives
US8356009B2 (en) 2006-09-15 2013-01-15 International Business Machines Corporation Implementation defined segments for relational database systems
US7685093B1 (en) 2006-09-15 2010-03-23 Initiate Systems, Inc. Method and system for comparing attributes such as business names
US8700579B2 (en) * 2006-09-18 2014-04-15 Infobright Inc. Method and system for data compression in a relational database
US8266147B2 (en) * 2006-09-18 2012-09-11 Infobright, Inc. Methods and systems for database organization
US8762834B2 (en) * 2006-09-29 2014-06-24 Altova, Gmbh User interface for defining a text file transformation
US9846739B2 (en) 2006-10-23 2017-12-19 Fotonation Limited Fast database matching
US7809747B2 (en) * 2006-10-23 2010-10-05 Donald Martin Monro Fuzzy database matching
US20080097992A1 (en) * 2006-10-23 2008-04-24 Donald Martin Monro Fast database matching
US7774329B1 (en) 2006-12-22 2010-08-10 Amazon Technologies, Inc. Cross-region data access in partitioned framework
US8150870B1 (en) 2006-12-22 2012-04-03 Amazon Technologies, Inc. Scalable partitioning in a multilayered data service framework
US7613707B1 (en) * 2006-12-22 2009-11-03 Amazon Technologies, Inc. Traffic migration in a multilayered data service framework
CN101226523B (zh) * 2007-01-17 2012-09-05 International Business Machines Corporation Data profiling method and system
US8359339B2 (en) 2007-02-05 2013-01-22 International Business Machines Corporation Graphical user interface for configuration of an algorithm for the matching of data records
US20080195575A1 (en) * 2007-02-12 2008-08-14 Andreas Schiffler Electronic data display management system and method
US8515926B2 (en) * 2007-03-22 2013-08-20 International Business Machines Corporation Processing related data from information sources
US8423514B2 (en) 2007-03-29 2013-04-16 International Business Machines Corporation Service provisioning
WO2008121700A1 (en) 2007-03-29 2008-10-09 Initiate Systems, Inc. Method and system for managing entities
WO2008121170A1 (en) 2007-03-29 2008-10-09 Initiate Systems, Inc. Method and system for parsing languages
WO2008121824A1 (en) 2007-03-29 2008-10-09 Initiate Systems, Inc. Method and system for data exchange among data sources
US20120164613A1 (en) * 2007-11-07 2012-06-28 Jung Edward K Y Determining a demographic characteristic based on computational user-health testing of a user interaction with advertiser-specified content
US8069129B2 (en) 2007-04-10 2011-11-29 Ab Initio Technology Llc Editing and compiling business rules
US20090254588A1 (en) * 2007-06-19 2009-10-08 Zhong Li Multi-Dimensional Data Merge
US20110010214A1 (en) * 2007-06-29 2011-01-13 Carruth J Scott Method and system for project management
CN101689089B (zh) 2007-07-12 2012-05-23 Atmel Corporation Two-dimensional touch panel
US20090055828A1 (en) * 2007-08-22 2009-02-26 Mclaren Iain Douglas Profile engine system and method
AU2008302144B2 (en) * 2007-09-20 2014-09-11 Ab Initio Technology Llc Managing data flows in graph-based computations
US9690820B1 (en) 2007-09-27 2017-06-27 Experian Information Solutions, Inc. Database system for triggering event notifications based on updates to database records
US8713434B2 (en) 2007-09-28 2014-04-29 International Business Machines Corporation Indexing, relating and managing information about entities
CA2701043C (en) 2007-09-28 2016-10-11 Initiate Systems, Inc. Method and system for associating data records in multiple languages
EP2193415A4 (en) * 2007-09-28 2013-08-28 Ibm Method and system for analyzing a system for matching data records
US8321914B2 (en) 2008-01-21 2012-11-27 International Business Machines Corporation System and method for verifying an attribute in records for procurement application
US8224797B2 (en) * 2008-03-04 2012-07-17 International Business Machines Corporation System and method for validating data record
US8046385B2 (en) * 2008-06-20 2011-10-25 Ab Initio Technology Llc Data quality tracking
AU2009267034B2 (en) 2008-06-30 2015-12-10 Ab Initio Technology Llc Data logging in graph-based computations
US8239389B2 (en) * 2008-09-29 2012-08-07 International Business Machines Corporation Persisting external index data in a database
EP2342684B1 (en) 2008-10-23 2024-05-29 Ab Initio Technology LLC Fuzzy data operations
JP5525541B2 (ja) 2008-12-02 2014-06-18 Ab Initio Technology Llc Mapping instances of a dataset within a data management system
US20100174638A1 (en) 2009-01-06 2010-07-08 ConsumerInfo.com Report existence monitoring
CA2749538A1 (en) * 2009-01-30 2010-08-05 Ab Initio Technology Llc Processing data using vector fields
KR101693229B1 (ko) * 2009-02-13 2017-01-05 Ab Initio Technology Llc Communicating with data storage systems
US8051060B1 (en) * 2009-02-13 2011-11-01 At&T Intellectual Property I, L.P. Automatic detection of separators for compression
US9886319B2 (en) 2009-02-13 2018-02-06 Ab Initio Technology Llc Task managing application for performing tasks based on messages received from a data processing application initiated by the task managing application
US10102398B2 (en) * 2009-06-01 2018-10-16 Ab Initio Technology Llc Generating obfuscated data
KR20150040384A (ko) * 2009-06-10 2015-04-14 Ab Initio Technology Llc Generating test data
JP2011008560A (ja) * 2009-06-26 2011-01-13 Hitachi Ltd Information management system
US8205113B2 (en) * 2009-07-14 2012-06-19 Ab Initio Technology Llc Fault tolerant batch processing
CA2771899C (en) * 2009-09-16 2017-08-01 Ab Initio Technology Llc Mapping dataset elements
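JP5411282B2 (ja) * 2009-09-17 2014-02-12 Panasonic Corporation Information processing device, management device, unauthorized module detection system, unauthorized module detection method, recording medium storing an unauthorized module detection program, management method, recording medium storing a management program, and integrated circuit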
US8700577B2 (en) * 2009-12-07 2014-04-15 Accenture Global Services Limited GmbH Method and system for accelerated data quality enhancement
WO2011081776A1 (en) * 2009-12-14 2011-07-07 Ab Initio Technology Llc Specifying user interface elements
US9477369B2 (en) * 2010-03-08 2016-10-25 Salesforce.Com, Inc. System, method and computer program product for displaying a record as part of a selected grouping of data
US8205114B2 (en) * 2010-04-07 2012-06-19 Verizon Patent And Licensing Inc. Method and system for partitioning data files for efficient processing
US8577094B2 (en) 2010-04-09 2013-11-05 Donald Martin Monro Image template masking
US8417727B2 (en) 2010-06-14 2013-04-09 Infobright Inc. System and method for storing data in a relational database
US8521748B2 (en) 2010-06-14 2013-08-27 Infobright Inc. System and method for managing metadata in a relational database
US8875145B2 (en) 2010-06-15 2014-10-28 Ab Initio Technology Llc Dynamically loading graph-based computations
JP5826260B2 (ja) 2010-06-22 2015-12-02 Ab Initio Technology Llc Processing related datasets
US8990165B2 (en) * 2010-07-13 2015-03-24 Hewlett-Packard Development Company, L.P. Methods, apparatus and articles of manufacture to archive data
US8515863B1 (en) * 2010-09-01 2013-08-20 Federal Home Loan Mortgage Corporation Systems and methods for measuring data quality over time
CA2814835C (en) 2010-10-25 2019-01-08 Ab Initio Technology Llc Managing data set objects in a dataflow graph that represents a computer program
KR20120061308A (ko) * 2010-12-03 2012-06-13 Samsung Electronics Co., Ltd. Apparatus and method for controlling a database in a portable terminal
AU2012205339B2 (en) 2011-01-14 2015-12-03 Ab Initio Technology Llc Managing changes to collections of data
WO2012103438A1 (en) * 2011-01-28 2012-08-02 Ab Initio Technology Llc Generating data pattern information
US9021299B2 (en) 2011-02-18 2015-04-28 Ab Initio Technology Llc Restarting processes
US9116759B2 (en) 2011-02-18 2015-08-25 Ab Initio Technology Llc Restarting data processing systems
US9311487B2 (en) * 2011-03-15 2016-04-12 Panasonic Corporation Tampering monitoring system, management device, protection control module, and detection module
US9558519B1 (en) 2011-04-29 2017-01-31 Consumerinfo.Com, Inc. Exposing reporting cycle information
US20120330880A1 (en) * 2011-06-23 2012-12-27 Microsoft Corporation Synthetic data generation
US8782016B2 (en) * 2011-08-26 2014-07-15 Qatar Foundation Database record repair
US9116934B2 (en) * 2011-08-26 2015-08-25 Qatar Foundation Holistic database record repair
US8863082B2 (en) * 2011-09-07 2014-10-14 Microsoft Corporation Transformational context-aware data source management
US8719271B2 (en) 2011-10-06 2014-05-06 International Business Machines Corporation Accelerating data profiling process
US9438656B2 (en) 2012-01-11 2016-09-06 International Business Machines Corporation Triggering window conditions by streaming features of an operator graph
US9430117B2 (en) * 2012-01-11 2016-08-30 International Business Machines Corporation Triggering window conditions using exception handling
US20130304712A1 (en) * 2012-05-11 2013-11-14 Theplatform For Media, Inc. System and method for validation
US9582553B2 (en) * 2012-06-26 2017-02-28 Sap Se Systems and methods for analyzing existing data models
US9633076B1 (en) * 2012-10-15 2017-04-25 Tableau Software Inc. Blending and visualizing data from multiple data sources
US10489360B2 (en) * 2012-10-17 2019-11-26 Ab Initio Technology Llc Specifying and applying rules to data
WO2014065919A1 (en) 2012-10-22 2014-05-01 Ab Initio Technology Llc Profiling data with location information
KR102113366B1 (ko) * 2012-10-22 2020-05-20 Ab Initio Technology Llc Characterizing data sources in a data storage system
US9507682B2 (en) 2012-11-16 2016-11-29 Ab Initio Technology Llc Dynamic graph performance monitoring
US10108521B2 (en) 2012-11-16 2018-10-23 Ab Initio Technology Llc Dynamic component performance monitoring
US9703822B2 (en) 2012-12-10 2017-07-11 Ab Initio Technology Llc System for transform generation
EP2757467A1 (en) * 2013-01-22 2014-07-23 Siemens Aktiengesellschaft Management apparatus and method for managing data elements of a version control system
US9892026B2 (en) * 2013-02-01 2018-02-13 Ab Initio Technology Llc Data records selection
US9135280B2 (en) * 2013-02-11 2015-09-15 Oracle International Corporation Grouping interdependent fields
US9471545B2 (en) 2013-02-11 2016-10-18 Oracle International Corporation Approximating value densities
US9110949B2 (en) 2013-02-11 2015-08-18 Oracle International Corporation Generating estimates for query optimization
US9811233B2 (en) 2013-02-12 2017-11-07 Ab Initio Technology Llc Building applications for configuring processes
US10332010B2 (en) 2013-02-19 2019-06-25 Business Objects Software Ltd. System and method for automatically suggesting rules for data stored in a table
US9576036B2 (en) 2013-03-15 2017-02-21 International Business Machines Corporation Self-analyzing data processing job to determine data quality issues
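KR101444249B1 (ko) * 2013-05-13 2014-09-26 (주) 아트리아트레이딩 Method, system, and non-transitory computer-readable recording medium for providing information on securities lending, short-selling, or equity swap transactions
JP6387399B2 (ja) * 2013-05-17 2018-09-05 Ab Initio Technology Llc Managing memory and storage space for data operations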
US20150032907A1 (en) * 2013-07-26 2015-01-29 Alcatel-Lucent Canada, Inc. Universal adapter with context-bound translation for application adaptation layer
WO2015027085A1 (en) 2013-08-22 2015-02-26 Genomoncology, Llc Computer-based systems and methods for analyzing genomes based on discrete data structures corresponding to genetic variants therein
KR102349573B1 (ko) 2013-09-27 2022-01-10 Ab Initio Technology Llc Evaluating rules applied to data
US20150120224A1 (en) * 2013-10-29 2015-04-30 C3 Energy, Inc. Systems and methods for processing data relating to energy usage
CA3114544A1 (en) 2013-12-05 2015-06-11 Ab Initio Technology Llc Managing interfaces for dataflow composed of sub-graphs
AU2014364882B2 (en) 2013-12-18 2020-02-06 Ab Initio Technology Llc Data generation
US9529849B2 (en) 2013-12-31 2016-12-27 Sybase, Inc. Online hash based optimizer statistics gathering in a database
US11487732B2 (en) * 2014-01-16 2022-11-01 Ab Initio Technology Llc Database key identification
US9984173B2 (en) * 2014-02-24 2018-05-29 International Business Machines Corporation Automated value analysis in legacy data
EP3594821B1 (en) * 2014-03-07 2023-08-16 AB Initio Technology LLC Managing data profiling operations related to data type
US9323809B2 (en) 2014-03-10 2016-04-26 Interana, Inc. System and methods for rapid data analysis
US9846567B2 (en) 2014-06-16 2017-12-19 International Business Machines Corporation Flash optimized columnar data layout and data access algorithms for big data query engines
US9633058B2 (en) 2014-06-16 2017-04-25 International Business Machines Corporation Predictive placement of columns during creation of a large database
WO2016011441A1 (en) 2014-07-18 2016-01-21 Ab Initio Technology Llc Managing parameter sets
KR102361154B1 (ko) * 2014-09-02 2022-02-09 Ab Initio Technology Llc Visually specifying subsets of components in graph-based programs through user interactions
US9626393B2 (en) 2014-09-10 2017-04-18 Ab Initio Technology Llc Conditional validation rules
US10055333B2 (en) 2014-11-05 2018-08-21 Ab Initio Technology Llc Debugging a graph
US9880818B2 (en) * 2014-11-05 2018-01-30 Ab Initio Technology Llc Application testing
US10296507B2 (en) 2015-02-12 2019-05-21 Interana, Inc. Methods for enhancing rapid data analysis
US9952808B2 (en) 2015-03-26 2018-04-24 International Business Machines Corporation File system block-level tiering and co-allocation
CN104850590A (zh) * 2015-04-24 2015-08-19 Baidu Online Network Technology (Beijing) Co., Ltd. Method and device for generating metadata of structured data
US11068647B2 (en) * 2015-05-28 2021-07-20 International Business Machines Corporation Measuring transitions between visualizations
KR101632073B1 (ko) * 2015-06-04 2016-06-20 장원중 Method, system, and non-transitory computer-readable recording medium for providing data profiling based on statistical analysis
EP3278213A4 (en) 2015-06-05 2019-01-30 C3 IoT, Inc. Systems, methods, and devices for an enterprise Internet-of-Things application development platform
US9384203B1 (en) * 2015-06-09 2016-07-05 Palantir Technologies Inc. Systems and methods for indexing and aggregating data records
US10409802B2 (en) * 2015-06-12 2019-09-10 Ab Initio Technology Llc Data quality analysis
US10241979B2 (en) * 2015-07-21 2019-03-26 Oracle International Corporation Accelerated detection of matching patterns
US10657134B2 (en) 2015-08-05 2020-05-19 Ab Initio Technology Llc Selecting queries for execution on a stream of real-time data
US10127264B1 (en) 2015-09-17 2018-11-13 Ab Initio Technology Llc Techniques for automated data analysis
US10607139B2 (en) 2015-09-23 2020-03-31 International Business Machines Corporation Candidate visualization techniques for use with genetic algorithms
US10140337B2 (en) * 2015-10-30 2018-11-27 Sap Se Fuzzy join key
CN108351898B (zh) * 2015-10-30 2021-10-08 Acxiom Corporation Automated interpretation for structured multi-field file layouts
US11410230B1 (en) 2015-11-17 2022-08-09 Consumerinfo.Com, Inc. Realtime access and control of secure regulated data
US10757154B1 (en) 2015-11-24 2020-08-25 Experian Information Solutions, Inc. Real-time event-based notification system
WO2017145386A1 (ja) * 2016-02-26 2017-08-31 Hitachi, Ltd. Analysis system and analysis method for executing analysis processing using at least part of time-series data and analysis data as input data
US10685035B2 (en) 2016-06-30 2020-06-16 International Business Machines Corporation Determining a collection of data visualizations
US10423387B2 (en) 2016-08-23 2019-09-24 Interana, Inc. Methods for highly efficient data sharding
US10146835B2 (en) * 2016-08-23 2018-12-04 Interana, Inc. Methods for stratified sampling-based query execution
US11604795B2 (en) 2016-09-26 2023-03-14 Splunk Inc. Distributing partial results from an external data system between worker nodes
US11620336B1 (en) 2016-09-26 2023-04-04 Splunk Inc. Managing and storing buckets to a remote shared storage system based on a collective bucket size
US10353965B2 (en) 2016-09-26 2019-07-16 Splunk Inc. Data fabric service system architecture
US11860940B1 (en) 2016-09-26 2024-01-02 Splunk Inc. Identifying buckets for query execution using a catalog of buckets
US10956415B2 (en) 2016-09-26 2021-03-23 Splunk Inc. Generating a subquery for an external data system using a configuration file
US11093703B2 (en) * 2016-09-29 2021-08-17 Google Llc Generating charts from data in a data table
US9720961B1 (en) 2016-09-30 2017-08-01 Semmle Limited Algebraic data types for database query languages
US9633078B1 (en) * 2016-09-30 2017-04-25 Semmle Limited Generating identifiers for tuples of recursively defined relations
US11741091B2 (en) 2016-12-01 2023-08-29 Ab Initio Technology Llc Generating, accessing, and displaying lineage metadata
US10650050B2 (en) 2016-12-06 2020-05-12 Microsoft Technology Licensing, Llc Synthesizing mapping relationships using table corpus
US10936555B2 (en) * 2016-12-22 2021-03-02 Sap Se Automated query compliance analysis
US10565173B2 (en) * 2017-02-10 2020-02-18 Wipro Limited Method and system for assessing quality of incremental heterogeneous data
US10002146B1 (en) * 2017-02-13 2018-06-19 Sas Institute Inc. Distributed data set indexing
US10514993B2 (en) * 2017-02-14 2019-12-24 Google Llc Analyzing large-scale data processing jobs
CN107220283B (zh) * 2017-04-21 2019-11-08 Neusoft Corporation Data processing method, device, storage medium, and electronic device
US9934287B1 (en) * 2017-07-25 2018-04-03 Capital One Services, Llc Systems and methods for expedited large file processing
US11921672B2 (en) 2017-07-31 2024-03-05 Splunk Inc. Query execution at a remote heterogeneous data store of a data fabric service
US11989194B2 (en) * 2017-07-31 2024-05-21 Splunk Inc. Addressing memory limits for partition tracking among worker nodes
US20200050612A1 (en) * 2017-07-31 2020-02-13 Splunk Inc. Supporting additional query languages through distributed execution of query engines
US11423083B2 (en) 2017-10-27 2022-08-23 Ab Initio Technology Llc Transforming a specification into a persistent computer program
US11055074B2 (en) * 2017-11-13 2021-07-06 Ab Initio Technology Llc Key-based logging for processing of structured data items with executable logic
US11509540B2 (en) * 2017-12-14 2022-11-22 Extreme Networks, Inc. Systems and methods for zero-footprint large-scale user-entity behavior modeling
US11068540B2 (en) 2018-01-25 2021-07-20 Ab Initio Technology Llc Techniques for integrating validation results in data profiling and related systems and methods
SG11202007022WA (en) * 2018-01-25 2020-08-28 Ab Initio Technology Llc Techniques for integrating validation results in data profiling and related systems and methods
US11334543B1 (en) 2018-04-30 2022-05-17 Splunk Inc. Scalable bucket merging for a data intake and query system
EP3575980A3 (en) 2018-05-29 2020-03-04 Accenture Global Solutions Limited Intelligent data quality
CA3106682A1 (en) * 2018-07-19 2020-01-23 Ab Initio Technology Llc Publishing to a data warehouse
US11080266B2 (en) * 2018-07-30 2021-08-03 Futurewei Technologies, Inc. Graph functional dependency checking
US20200074100A1 (en) 2018-09-05 2020-03-05 Consumerinfo.Com, Inc. Estimating changes to user risk indicators based on modeling of similarly categorized users
US11227065B2 (en) 2018-11-06 2022-01-18 Microsoft Technology Licensing, Llc Static data masking
US11423009B2 (en) * 2019-05-29 2022-08-23 ThinkData Works, Inc. System and method to prevent formation of dark data
US11704494B2 (en) * 2019-05-31 2023-07-18 Ab Initio Technology Llc Discovering a semantic meaning of data fields from profile data of the data fields
US11153400B1 (en) 2019-06-04 2021-10-19 Thomas Layne Bascom Federation broker system and method for coordinating discovery, interoperability, connections and correspondence among networked resources
US11494380B2 (en) 2019-10-18 2022-11-08 Splunk Inc. Management of distributed computing framework components in a data fabric service system
CN111143433A (zh) * 2019-12-10 2020-05-12 Ping An Property & Casualty Insurance Company of China, Ltd. Method and device for computing statistics on data warehouse data
FR3105844A1 (fr) * 2019-12-31 2021-07-02 Bull Sas Method and system for identifying relevant variables
KR102365910B1 (ko) * 2019-12-31 2022-02-22 가톨릭관동대학교산학협력단 Data profiling method and data profiling system using an attribute value quality index
US11200215B2 (en) * 2020-01-30 2021-12-14 International Business Machines Corporation Data quality evaluation
US11922222B1 (en) 2020-01-30 2024-03-05 Splunk Inc. Generating a modified component for a data intake and query system using an isolated execution environment image
US11321340B1 (en) 2020-03-31 2022-05-03 Wells Fargo Bank, N.A. Metadata extraction from big data sources
US11556563B2 (en) * 2020-06-12 2023-01-17 Oracle International Corporation Data stream processing
US11403268B2 (en) * 2020-08-06 2022-08-02 Sap Se Predicting types of records based on amount values of records
US11704313B1 (en) 2020-10-19 2023-07-18 Splunk Inc. Parallel branch operation using intermediary nodes
KR102265937B1 (ko) * 2020-12-21 2021-06-17 주식회사 모비젠 Method and apparatus for analyzing sequence data
US11847390B2 (en) 2021-01-05 2023-12-19 Capital One Services, Llc Generation of synthetic data using agent-based simulations
US20220215243A1 (en) * 2021-01-05 2022-07-07 Capital One Services, Llc Risk-Reliability Framework for Evaluating Synthetic Data Models
EP4285238A1 (en) 2021-01-31 2023-12-06 Ab Initio Technology LLC Data processing system with manipulation of logical dataset groups
US11537594B2 (en) 2021-02-05 2022-12-27 Oracle International Corporation Approximate estimation of number of distinct keys in a multiset using a sample
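CN112925792B (zh) * 2021-03-26 2024-01-05 北京中经惠众科技有限公司 Data storage control method, device, computing device, and medium
CN113656430B (zh) * 2021-08-12 2024-02-27 上海二三四五网络科技有限公司 Control method and device for automatic expansion of batch table data
KR102437098B1 (ko) 2022-04-15 2022-08-25 이찬영 Method and apparatus for determining erroneous data based on artificial intelligence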
US11907051B1 (en) 2022-09-07 2024-02-20 International Business Machines Corporation Correcting invalid zero value for data monitoring

Family Cites Families (73)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2760794B2 (ja) * 1988-01-29 1998-06-04 Hitachi, Ltd. Database processing method and apparatus
US5179643A (en) 1988-12-23 1993-01-12 Hitachi, Ltd. Method of multi-dimensional analysis and display for a large volume of record information items and a system therefor
JPH032938A (ja) 1989-05-31 1991-01-09 Hitachi Ltd Database processing method
JPH04152440A (ja) * 1990-10-17 1992-05-26 Hitachi Ltd Intelligent query processing method
FR2698465B1 (fr) 1992-11-20 1995-01-13 Bull Sa Method for extracting statistics profiles, and use of the statistics created by the method
US5742806A (en) * 1994-01-31 1998-04-21 Sun Microsystems, Inc. Apparatus and method for decomposing database queries for database management system including multiprocessor digital data processing system
JP3519126B2 (ja) 1994-07-14 2004-04-12 株式会社リコー Automatic layout system
US5842200A (en) 1995-03-31 1998-11-24 International Business Machines Corporation System and method for parallel mining of association rules in databases
US6601048B1 (en) * 1997-09-12 2003-07-29 Mci Communications Corporation System and method for detecting and managing fraud
US5966072A (en) 1996-07-02 1999-10-12 Ab Initio Software Corporation Executing computations expressed as graphs
US5778373A (en) 1996-07-15 1998-07-07 At&T Corp Integration of an information server database schema by generating a translation map from exemplary files
US6138123A (en) * 1996-07-25 2000-10-24 Rathbun; Kyle R. Method for creating and using parallel data structures
JPH1055367A (ja) 1996-08-09 1998-02-24 Hitachi Ltd Data utilization system
US5845285A (en) 1997-01-07 1998-12-01 Klein; Laurence C. Computer system and method of data analysis
US5987453A (en) 1997-04-07 1999-11-16 Informix Software, Inc. Method and apparatus for performing a join query in a database system
US6134560A (en) 1997-12-16 2000-10-17 Kliebhan; Daniel F. Method and apparatus for merging telephone switching office databases
US6826556B1 (en) * 1998-10-02 2004-11-30 Ncr Corporation Techniques for deploying analytic models in a parallel
US6959300B1 (en) * 1998-12-10 2005-10-25 At&T Corp. Data compression method and apparatus
US6343294B1 (en) 1998-12-15 2002-01-29 International Business Machines Corporation Data file editor for multiple data subsets
JP4037001B2 (ja) * 1999-02-23 2008-01-23 三菱電機株式会社 Database creation device and database search device
US6741995B1 (en) * 1999-03-23 2004-05-25 Metaedge Corporation Method for dynamically creating a profile
US6430539B1 (en) * 1999-05-06 2002-08-06 Hnc Software Predictive modeling of consumer financial behavior
US6163774A (en) 1999-05-24 2000-12-19 Platinum Technology Ip, Inc. Method and apparatus for simplified and flexible selection of aggregate and cross product levels for a data warehouse
WO2000079415A2 (en) 1999-06-18 2000-12-28 Torrent Systems, Inc. Segmentation and processing of continuous data streams using transactional semantics
US6801938B1 (en) * 1999-06-18 2004-10-05 Torrent Systems, Inc. Segmentation and processing of continuous data streams using transactional semantics
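JP3318834B2 (ja) 1999-07-30 2002-08-26 三菱電機株式会社 Data file system and data search method
JP3567861B2 (ja) 2000-07-07 2004-09-22 日本電信電話株式会社 Information source location estimation method and apparatus, and storage medium storing an information source location estimation program
JP4366845B2 (ja) * 2000-07-24 2009-11-18 ソニー株式会社 Data processing device, data processing method, and program providing medium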
US6788302B1 (en) 2000-08-03 2004-09-07 International Business Machines Corporation Partitioning and load balancing graphical shape data for parallel applications
US20020073138A1 (en) * 2000-12-08 2002-06-13 Gilbert Eric S. De-identification and linkage of data records
US6952693B2 (en) 2001-02-23 2005-10-04 Ran Wolff Distributed mining of association rules
US20020161778A1 (en) 2001-02-24 2002-10-31 Core Integration Partners, Inc. Method and system of data warehousing and building business intelligence using a data storage model
US20020120602A1 (en) 2001-02-28 2002-08-29 Ross Overbeek System, method and computer program product for simultaneous analysis of multiple genomes
JP2002269114A (ja) * 2001-03-14 2002-09-20 Kousaku Ookubo Knowledge database and method for constructing a knowledge database
US20030033138A1 (en) * 2001-07-26 2003-02-13 Srinivas Bangalore Method for partitioning a data set into frequency vectors for clustering
US7130852B2 (en) 2001-07-27 2006-10-31 Silicon Valley Bank Internal security system for a relational database system
AU2002355530A1 (en) * 2001-08-03 2003-02-24 John Allen Ananian Personalized interactive digital catalog profiling
US6801903B2 (en) 2001-10-12 2004-10-05 Ncr Corporation Collecting statistics in a database system
US20030140027A1 (en) * 2001-12-12 2003-07-24 Jeffrey Huttel Universal Programming Interface to Knowledge Management (UPIKM) database system with integrated XML interface
US7813937B1 (en) * 2002-02-15 2010-10-12 Fair Isaac Corporation Consistency modeling of healthcare claims to detect fraud and abuse
US7031969B2 (en) 2002-02-20 2006-04-18 Lawrence Technologies, Llc System and method for identifying relationships between database records
EP1488646B1 (en) * 2002-03-19 2017-05-03 Mapinfo Corporation Location based service provider
US20040083199A1 (en) * 2002-08-07 2004-04-29 Govindugari Diwakar R. Method and architecture for data transformation, normalization, profiling, cleansing and validation
US6657568B1 (en) 2002-08-27 2003-12-02 Fmr Corp. Data packing for real-time streaming
US7047230B2 (en) * 2002-09-09 2006-05-16 Lucent Technologies Inc. Distinct sampling system and a method of distinct sampling for optimizing distinct value query estimates
WO2004036461A2 (en) * 2002-10-14 2004-04-29 Battelle Memorial Institute Information reservoir
US7698163B2 (en) * 2002-11-22 2010-04-13 Accenture Global Services Gmbh Multi-dimensional segmentation for use in a customer interaction
US7403942B1 (en) * 2003-02-04 2008-07-22 Seisint, Inc. Method and system for processing data records
US7117222B2 (en) * 2003-03-13 2006-10-03 International Business Machines Corporation Pre-formatted column-level caching to improve client performance
US7433861B2 (en) * 2003-03-13 2008-10-07 International Business Machines Corporation Byte-code representations of actual data to reduce network traffic in database transactions
US20040249810A1 (en) * 2003-06-03 2004-12-09 Microsoft Corporation Small group sampling of data for use in query processing
GB0314591D0 (en) 2003-06-21 2003-07-30 Ibm Profiling data in a data store
US7426520B2 (en) 2003-09-10 2008-09-16 Exeros, Inc. Method and apparatus for semantic discovery and mapping between data sources
AU2004275334B9 (en) 2003-09-15 2011-06-16 Ab Initio Technology Llc. Data Profiling
US7587394B2 (en) * 2003-09-23 2009-09-08 International Business Machines Corporation Methods and apparatus for query rewrite with auxiliary attributes in query processing operations
US7149736B2 (en) 2003-09-26 2006-12-12 Microsoft Corporation Maintaining time-sorted aggregation records representing aggregations of values from multiple database records using multiple partitions
WO2005050482A1 (en) 2003-10-21 2005-06-02 Nielsen Media Research, Inc. Methods and apparatus for fusing databases
US20050177578A1 (en) 2004-02-10 2005-08-11 Chen Yao-Ching S. Efficient type annontation of XML schema-validated XML documents without schema validation
US7376656B2 (en) * 2004-02-10 2008-05-20 Microsoft Corporation System and method for providing user defined aggregates in a database system
US8447743B2 (en) * 2004-08-17 2013-05-21 International Business Machines Corporation Techniques for processing database queries including user-defined functions
US7774346B2 (en) 2005-08-26 2010-08-10 Oracle International Corporation Indexes that are based on bitmap values and that use summary bitmap values
US20070073721A1 (en) 2005-09-23 2007-03-29 Business Objects, S.A. Apparatus and method for serviced data profiling operations
US8271452B2 (en) 2006-06-12 2012-09-18 Rainstor Limited Method, system, and database archive for enhancing database archiving
US8412713B2 (en) 2007-03-06 2013-04-02 Mcafee, Inc. Set function calculation in a database
US7912867B2 (en) * 2008-02-25 2011-03-22 United Parcel Services Of America, Inc. Systems and methods of profiling data for integration
US9251212B2 (en) 2009-03-27 2016-02-02 Business Objects Software Ltd. Profiling in a massive parallel processing environment
CA2771899C (en) 2009-09-16 2017-08-01 Ab Initio Technology Llc Mapping dataset elements
KR101755365B1 (ko) 2009-11-13 2017-07-10 아브 이니티오 테크놀로지 엘엘시 Managing record format information
US8396873B2 (en) 2010-03-10 2013-03-12 Emc Corporation Index searching using a bloom filter
US8296274B2 (en) 2011-01-27 2012-10-23 Leppard Andrew Considering multiple lookups in bloom filter decision making
WO2012103438A1 (en) 2011-01-28 2012-08-02 Ab Initio Technology Llc Generating data pattern information
US8610605B2 (en) 2011-06-17 2013-12-17 Sap Ag Method and system for data compression
US8762396B2 (en) 2011-12-22 2014-06-24 Sap Ag Dynamic, hierarchical bloom filters for network data routing

Also Published As

Publication number Publication date
KR20080016532A (ko) 2008-02-21
US20150106341A1 (en) 2015-04-16
US20050114369A1 (en) 2005-05-26
AU2004275334B9 (en) 2011-06-16
KR20070106574A (ko) 2007-11-01
KR20060080588A (ko) 2006-07-10
CA2655735C (en) 2011-01-18
CN102982065B (zh) 2016-09-21
CA2655731C (en) 2012-04-10
WO2005029369A9 (en) 2006-05-04
JP5372850B2 (ja) 2013-12-18
CA2655735A1 (en) 2005-03-31
EP2261821B1 (en) 2022-12-07
US8868580B2 (en) 2014-10-21
KR100899850B1 (ko) 2009-05-27
KR20090039803A (ko) 2009-04-22
WO2005029369A3 (en) 2005-08-25
EP1676217B1 (en) 2011-07-06
CA2538568C (en) 2009-05-19
CA2538568A1 (en) 2005-03-31
US20160239532A1 (en) 2016-08-18
AU2004275334A1 (en) 2005-03-31
US7756873B2 (en) 2010-07-13
US9323802B2 (en) 2016-04-26
AU2009200293A1 (en) 2009-02-19
JP2010267288A (ja) 2010-11-25
EP2261820A2 (en) 2010-12-15
AU2009200293B2 (en) 2011-07-07
CA2655731A1 (en) 2005-03-31
AU2009200294A1 (en) 2009-02-19
EP2261821A2 (en) 2010-12-15
US20050102325A1 (en) 2005-05-12
AU2004275334B2 (en) 2011-02-10
ATE515746T1 (de) 2011-07-15
US20050114368A1 (en) 2005-05-26
US7849075B2 (en) 2010-12-07
JP5328099B2 (ja) 2013-10-30
CN102982065A (zh) 2013-03-20
JP2007506191A (ja) 2007-03-15
EP1676217A2 (en) 2006-07-05
WO2005029369A2 (en) 2005-03-31
EP2261820A3 (en) 2010-12-29
JP5372851B2 (ja) 2013-12-18
KR100922141B1 (ko) 2009-10-19
KR101033179B1 (ko) 2011-05-11
JP2010267289A (ja) 2010-11-25
EP2261821A3 (en) 2010-12-29

Similar Documents

Publication Publication Date Title
HK1093568A1 (en) Data profiling
WO2005050367A3 (en) Systems and methods for search query processing using trend analysis
DE60330955D1 (de) Method and computer system for query processing
CA2333381A1 (en) Data processing system and method for organizing, analyzing, recording, storing and reporting research results
CA2487999A1 (en) Behavior-based adaptation of computer systems
WO2004061582A3 (en) Method and system for parts analysis
FR2823337B1 (fr) Method for reading, processing, transmitting and using a barcode
EP0569133A3 (xx)
DE69811832T2 (de) Method for estimating statistics of the properties of interactions processed by a processor pipeline
WO2005009208A3 (en) System, method, and apparatus for evaluating a person's athletic ability
FR2750519B1 (fr) Method for extracting documents from databases
WO2007060664A3 (en) System and method of managing data protection resources
IL150064A0 (en) Timeshared electronic catalog system and method
EP1376397A3 (en) Method of extracting item patterns across a plurality of databases, a network system and a processing apparatus
MX9205058A (es) Multi-function, multi-state data processing key and key array.
DE69109856D1 (de) Device for multidimensional information input.
EP1039398A3 (en) Scheme for systematically registering meta-data with respect to various types of data
AU2001269792A1 (en) System, method and computer program product for reading, correlating, processing, categorizing and aggregating events of any type
WO2005020125A3 (en) Methods and systems for profiling biological systems
EP0407050A3 (en) Computer systems including a process database
DE60003253D1 (de) Apparatus and method for preventing computer program changes, and corresponding computer program data carrier.
FR2708766B1 (fr) Method for analyzing deadlocks in an operating system.
WO2004066115A3 (en) Improved interface for modifying data fields in a mark-up language environment
IL159633A0 (en) Method and system for reorganizing a tablespace in a database
WO2004031896A3 (en) System and method for accessing medical records