WO2006037747A3 - Method for structuring a data stock that is stored on at least one storage medium

Info

Publication number
WO2006037747A3
Authority
WO
WIPO (PCT)
Prior art keywords
parametric
data
structuring
stored
data records
Prior art date
Application number
PCT/EP2005/054891
Other languages
German (de)
French (fr)
Other versions
WO2006037747A2 (en)
Inventor
Volker Tresp
Kai Yu
Shipeng Yu
Original Assignee
Siemens Ag
Volker Tresp
Kai Yu
Shipeng Yu
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2004-10-04
Filing date
2005-09-28
Publication date
2007-05-31
Application filed by Siemens Ag, Volker Tresp, Kai Yu, Shipeng Yu
Publication of WO2006037747A2
Publication of WO2006037747A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • Operations Research (AREA)
  • Databases & Information Systems (AREA)
  • Algebra (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Complex Calculations (AREA)

Abstract

The invention relates to a non-parametric Bayesian method for analysing data records in which elements occur with a specific frequency. The proposed model retains the strengths of earlier approaches, in which the latent factors of each data record (e.g. the topics of documents) were investigated, while at the same time permitting the investigation of the cluster structures of the data records, which reflect the statistical dependency of the latent factors. Compared to parametric Bayesian modelling, the non-parametric model induced by a Dirichlet process (DP) is flexible enough to reveal the data structure. Instead of relying on Markov chain Monte Carlo (MCMC) sampling, which is slow in this setting, the inventive method introduces an efficient variational inference based on a finite, high-dimensional approximation of the DP.
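
The idea of replacing a Dirichlet process by a finite approximation can be illustrated with a minimal sketch. The abstract does not state which approximation the method uses, so the code below assumes truncated stick-breaking, one common choice, and covers only drawing cluster weights and assignments from the approximate prior, not the variational inference itself. The function names, the concentration parameter `alpha` and the truncation level are hypothetical and chosen for illustration only.

```python
import random


def truncated_stick_breaking(alpha, truncation, rng):
    """Draw cluster weights from a truncated stick-breaking approximation of a DP.

    Assumption: stick-breaking is used here for illustration; the source
    does not say which finite approximation the method relies on.
    """
    if alpha <= 0 or truncation < 1:
        raise ValueError("alpha must be positive and truncation at least 1")
    weights = []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, alpha)  # share broken off the rest
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # last component absorbs the leftover mass
    return weights


def sample_cluster_assignments(weights, n_records, rng):
    """Assign each data record to one of the finitely many clusters."""
    return rng.choices(range(len(weights)), weights=weights, k=n_records)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights = truncated_stick_breaking(alpha=2.0, truncation=20, rng=rng)
assignments = sample_cluster_assignments(weights, n_records=100, rng=rng)
```

A smaller `alpha` tends to concentrate the mass on a few clusters, while a larger `alpha` spreads it over more of them; the truncation level only bounds how many clusters can appear.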

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102004048272.1 2004-10-04
DE102004048272 2004-10-04

Publications (2)

Publication Number Publication Date
WO2006037747A2 (en) 2006-04-13
WO2006037747A3 (en) 2007-05-31

Family

ID=35985416

Family Applications (1)

Application Number Publication Priority Date Filing Date Title
PCT/EP2005/054891 WO2006037747A2 (en) 2004-10-04 2005-09-28 Method for structuring a data stock that is stored on at least one storage medium

Country Status (1)

Country Link
WO (1) WO2006037747A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105426647B (en) * 2016-01-18 2018-08-07 中国人民解放军国防科学技术大学 Cold stand-by systems reliability estimation method based on the fusion of reliability prior information

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BLEI D M ET AL: "Hierarchical topic models and the nested Chinese restaurant process", ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS (NIPS) 16 (THRUN S ET AL EDITORS), 2004, Cambridge, MA, MIT Press, XP002427365, Retrieved from the Internet <URL:http://www.cs.princeton.edu/~blei/papers/BleiGriffithsJordanTenenbaum2003.pdf> [retrieved on 20070329] *
BLEI D M ET AL: "Latent Dirichlet Allocation", JOURNAL OF MACHINE LEARNING RESEARCH, vol. 3, January 2003 (2003-01-01), pages 993 - 1022, XP002427366, Retrieved from the Internet <URL:http://portal.acm.org/citation.cfm?id=944937&dl=GUIDE,> [retrieved on 20070329] *
BLEI D M ET AL: "Variational methods for the Dirichlet process", 21ST INTERNATIONAL CONFERENCE ON MACHINE LEARNING (ICML 2004), 4-8 JULY 2004, BANFF, CANADA, 4 July 2004 (2004-07-04), XP002427364, Retrieved from the Internet <URL:http://portal.acm.org/citation.cfm?id=1015330.1015439> [retrieved on 20070329] *
YU K ET AL: "Dirichlet enhanced latent semantic analysis", 10TH INTERNATIONAL WORKSHOP ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS-05), 6-8 JANUARY 2005, BARBADOS, 6 January 2005 (2005-01-06), XP002427363, Retrieved from the Internet <URL:http://www.dbs.informatik.uni-muenchen.de/~yu_k/aistat2005l.pdf> [retrieved on 20070329] *

Also Published As

Publication number Publication date
WO2006037747A2 (en) 2006-04-13

Similar Documents

Publication Publication Date Title
Moritz Göhler et al. Robustness metrics: Consolidating the multiple approaches to quantify robustness
Crocker et al. Information density and linguistic encoding (ideal)
CN101770580B (en) Training method and classification method of cross-field text sentiment classifier
WO2008094848A3 (en) Apparatus and method for data charting with an extensible visualization library
JP2011118765A5 (en)
Naga et al. Analyzing the effect of moving resonance on seismic response of structures using wavelet transforms
Zhang et al. Exact Finite Difference Scheme and Nonstandard Finite Difference Scheme for Burgers and Burgers‐Fisher Equations
Almasov et al. Life-cycle optimization of the carbon dioxide huff-n-puff process in an unconventional oil reservoir using least-squares support vector and Gaussian process regression proxies
Daumé III et al. A Bayesian model for discovering typological implications
WO2008094486A3 (en) Reporting fixed-point information for a graphical mode
WO2006037747A3 (en) Method for structuring a data stock that is stored on at least one storage medium
US8452626B2 (en) Technology replacement cost estimation using environmental cost considerations
US20150339291A1 (en) Method and apparatus for performing bilingual word alignment
Chong et al. The Fine Structure of Volatility Dynamics
Bernstein Optimal prediction of Burgers’s equation
WO2005114540A3 (en) Antivirus product using in-kernel cache of file state
Haidar et al. Unsupervised language model adaptation using LDA-based mixture models and latent semantic marginals
Kumar et al. Length biased weighted residual inaccuracy measure
WO2009039275A3 (en) Geospatial modeling system providing wavelet decomposition and inpainting features and related methods
Rychlik Note on modelling of fatigue damage rates for non‐Gaussian stresses
Scarcella Recurrent neural network language models in the context of under-resourced South African languages
Beckmann et al. Cache calculus: Modeling caches through differential equations
Elinger et al. Practical Considerations for Use of Causation Entropy in Sparsity Identification
Schutte Numerical simulation of tyre/road noise
Gu et al. On the convergence rate of fixed design regression estimators for negatively associated random variables

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase