TY  - GEN
ID  - cogprints5626
UR  - http://cogprints.org/5626/
A1  - Goutte, Cyril
TI  - Fast & Confident Probabilistic Categorization
Y1  - 2007///
N2  - We describe NRC's submission to the Anomaly Detection/Text Mining competition organised at the Text Mining Workshop 2007. This submission relies on a straightforward implementation of the probabilistic categoriser described by Gaussier et al. (ECIR'02). The categoriser is adapted to handle multiple labelling, and a piecewise-linear confidence estimation layer is added to estimate the labelling confidence. This technique achieves a score of 1.689 on the test data.
AV  - public
KW  - Text categorization
KW  - probabilistic model
KW  - confidence estimation
KW  - multi-label categorization
KW  - category description
ER  -