Missaoui, Rokia and Goutte, Cyril and Kouomou Choupo, Anicet and Boujenoui, Ameur (2007) A Probabilistic Model for Data Cube Compression and Query Approximation. [Conference Paper] (In Press)
Abstract
Databases and data warehouses contain an overwhelming volume of information that users must wade through in order to extract valuable and actionable knowledge to support the decision-making process. This contribution addresses the problem of automatically analyzing large multidimensional tables to get a concise representation of data, identify patterns and provide approximate answers to queries. Since data cubes are nothing but multi-way tables, we propose to analyze the potential of a probabilistic modeling technique, called non-negative multi-way array factorization, for approximating aggregate and multidimensional values. Using such a technique, we compute the set of components (clusters) that best fit the initial data set and whose superposition approximates the original data. The generated components can then be exploited for approximately answering OLAP queries such as roll-up, slice and dice operations. The proposed modeling technique will then be compared against the log-linear modeling technique which has already been used in the literature for compression and outlier detection in data cubes. Finally, three data sets will be used to discuss the potential benefits of non-negative multi-way array factorization.
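To make the idea of components whose superposition approximates the cube more concrete, here is a minimal sketch for a three-dimensional cube of non-negative counts. The abstract does not spell out the fitting procedure, so the sketch assumes a mixture of the form P(i,j,k) = Σ_c P(c) P(i|c) P(j|c) P(k|c) estimated by EM; the function names `fit_nmaf`, `reconstruct` and `roll_up_last` are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_nmaf(cube, n_components, n_iter=200, seed=0):
    """Fit P(i,j,k) = sum_c P(c) P(i|c) P(j|c) P(k|c) to a 3-way count array by EM.

    Assumed model and algorithm (illustrative only, not the paper's code).
    """
    rng = np.random.default_rng(seed)
    I, J, K = cube.shape
    total = cube.sum()
    # Random non-negative start; every column is a probability distribution.
    pc = np.full(n_components, 1.0 / n_components)
    pi = rng.random((I, n_components)); pi /= pi.sum(axis=0)
    pj = rng.random((J, n_components)); pj /= pj.sum(axis=0)
    pk = rng.random((K, n_components)); pk /= pk.sum(axis=0)
    for _ in range(n_iter):
        # E-step: posterior P(c | i,j,k) for every cell.
        joint = np.einsum('c,ic,jc,kc->ijkc', pc, pi, pj, pk)
        post = joint / np.maximum(joint.sum(axis=3, keepdims=True), 1e-300)
        # M-step: re-estimate parameters from the expected counts per component.
        w = cube[..., None] * post
        nc = np.maximum(w.sum(axis=(0, 1, 2)), 1e-300)
        pi = w.sum(axis=(1, 2)) / nc
        pj = w.sum(axis=(0, 2)) / nc
        pk = w.sum(axis=(0, 1)) / nc
        pc = nc / total
    return total, pc, pi, pj, pk

def reconstruct(total, pc, pi, pj, pk):
    """Approximate cube: the superposition of all components."""
    return total * np.einsum('c,ic,jc,kc->ijk', pc, pi, pj, pk)

def roll_up_last(total, pc, pi, pj, pk):
    """Approximate roll-up over the last dimension, computed from the factors only.

    Each column of pk sums to 1, so the k factor simply drops out.
    """
    return total * np.einsum('c,ic,jc->ij', pc, pi, pj)
```

Under these assumptions the model stores on the order of C·(I+J+K) parameters instead of I·J·K cells, and a roll-up is answered by dropping the factor of the aggregated dimension rather than by rebuilding the cube. A slice or dice would similarly select rows of the relevant factor before taking the product.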
Item Type: | Conference Paper |
---|---|
Keywords: | Data warehousing, data cubes, OLAP, approximation, compression, data mining, non-negative multi-way array factorization, log-linear modeling |
Subjects: | Computer Science > Statistical Models; Computer Science > Artificial Intelligence |
ID Code: | 5702 |
Deposited By: | Goutte, Dr. Cyril |
Deposited On: | 10 Sep 2007 |
Last Modified: | 11 Mar 2011 08:56 |