Goutte, Cyril and Missaoui, Rokia and Boujenoui, Ameur (2007) Data Cube Approximation and Mining using Probabilistic Modeling. [Departmental Technical Report]
Abstract
On-line Analytical Processing (OLAP) techniques commonly used in data warehouses allow the exploration of data cubes along different analysis axes (dimensions) and at different abstraction levels in a dimension hierarchy. However, such techniques are not aimed at mining multidimensional data. Since data cubes are nothing but multi-way tables, we propose to analyze the potential of two probabilistic modeling techniques, namely non-negative multi-way array factorization and log-linear modeling, with the ultimate objective of compressing and mining aggregate and multidimensional values. With the first technique, we compute the set of components that best fit the initial data set and whose superposition coincides with the original data; with the second technique, we identify a parsimonious model (i.e., one with a reduced set of parameters), highlight strong associations among dimensions, and discover possible outliers in data cells. A real-life example is used to (i) discuss the potential benefits of the modeling output on cube exploration and mining, (ii) show how OLAP queries can be answered in an approximate way, and (iii) illustrate the strengths and limitations of these modeling approaches.
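To make the first technique concrete, the following is a minimal sketch of one common probabilistic formulation of non-negative multi-way array factorization: a latent-class mixture fitted by EM, in which each cell of a three-way cube is modeled as a sum of non-negative components. This is an assumption chosen for illustration, not necessarily the algorithm used in the report. The function names (`fit_nonneg_factorization`, `reconstruct`, `rollup_last_dim`), the number of components, and the random toy cube are all hypothetical. With few components the superposition is generally only an approximation of the cube, which is what allows compression and approximate query answering.

```python
import numpy as np

def fit_nonneg_factorization(cube, n_components, n_iter=200, seed=0):
    """Fit cube[i,j,k] ~ N * sum_c P(c) P(i|c) P(j|c) P(k|c) by EM.

    Illustrative latent-class model for a 3-way array of non-negative counts.
    """
    rng = np.random.default_rng(seed)
    I, J, K = cube.shape
    total = cube.sum()
    # Random non-negative starting factors; each column is a distribution.
    pc = np.full(n_components, 1.0 / n_components)
    pa = rng.random((I, n_components)); pa /= pa.sum(axis=0)
    pb = rng.random((J, n_components)); pb /= pb.sum(axis=0)
    pd = rng.random((K, n_components)); pd /= pd.sum(axis=0)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each cell.
        joint = np.einsum('c,ic,jc,kc->ijkc', pc, pa, pb, pd)
        post = joint / np.maximum(joint.sum(axis=3, keepdims=True), 1e-300)
        # M-step: re-estimate the factors from expected counts.
        weighted = cube[..., None] * post
        nc = weighted.sum(axis=(0, 1, 2))
        safe = np.maximum(nc, 1e-300)
        pa = weighted.sum(axis=(1, 2)) / safe
        pb = weighted.sum(axis=(0, 2)) / safe
        pd = weighted.sum(axis=(0, 1)) / safe
        pc = nc / total
    return total, pc, pa, pb, pd

def reconstruct(total, pc, pa, pb, pd):
    """Superpose the components to rebuild an approximate cube."""
    return total * np.einsum('c,ic,jc,kc->ijk', pc, pa, pb, pd)

def rollup_last_dim(total, pc, pa, pb, pd):
    """Approximate roll-up over the third dimension, computed from the
    factors alone (each column of pd sums to 1, so it drops out)."""
    return total * np.einsum('c,ic,jc->ij', pc, pa, pb)

# Toy 4 x 3 x 2 cube of made-up counts, for illustration only.
rng = np.random.default_rng(1)
cube = rng.integers(0, 20, size=(4, 3, 2)).astype(float)
model = fit_nonneg_factorization(cube, n_components=2)
approx = reconstruct(*model)
rollup = rollup_last_dim(*model)
```

In this sketch the fitted model stores only the component weights and one small factor matrix per dimension, and an aggregate query such as the roll-up above can be answered from those factors without materializing the full cube.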
Item Type: | Departmental Technical Report |
---|---|
Keywords: | data cubes, OLAP, data warehouses, multidimensional data, non-negative multi-way array factorization, log-linear modeling |
Subjects: | Computer Science > Machine Learning; Computer Science > Artificial Intelligence |
ID Code: | 5622 |
Deposited By: | Goutte, Dr. Cyril |
Deposited On: | 28 Jul 2007 |
Last Modified: | 11 Mar 2011 08:56 |