"EXPLOITING N-GRAM IMPORTANCE AND ADDITIONAL KNOWLEDGE BASED ON WIKIPEDIA FOR IMPROVEMENTS IN GAAC BASED DOCUMENT CLUSTERING"^^ .
"This paper addresses the question: “How can we use Wikipedia-based concepts in document clustering with less human involvement while still achieving effective improvements in results?” We propose a method that exploits the importance of N-grams in a document and uses Wikipedia-based additional knowledge for GAAC-based document clustering. The importance of an N-gram in a document depends on several features, including but not limited to its frequency, the position of its occurrence in a sentence, and the position of that sentence in the document. First, we introduce a new similarity measure that takes weighted N-gram importance into account when performing document clustering, which improves the chances of grouping topically similar documents. Second, we use Wikipedia as an additional knowledge base, both to remove noisy entries from the extracted N-grams and to reduce the information gap between conceptually related N-grams that fail to match owing to differences in writing scheme or strategy. Our experimental results on a publicly available text dataset show that the devised system significantly improves performance over state-of-the-art bag-of-words-based systems in this area."^^ .
"2010-10-25" .
.
"Vasudeva"^^ .
"Varma"^^ .
"Vasudeva Varma"^^ .
.
"Niraj"^^ .
"Kumar"^^ .
"Niraj Kumar"^^ .
.
"Kannan"^^ .
"Srinathan"^^ .
"Kannan Srinathan"^^ .
.
"Venkata Vinay Babu"^^ .
"Vemula"^^ .
"Venkata Vinay Babu Vemula"^^ .
.
"EXPLOITING N-GRAM IMPORTANCE AND ADDITIONAL KNOWLEDGE BASED ON WIKIPEDIA FOR IMPROVEMENTS IN GAAC BASED DOCUMENT CLUSTERING (PDF)"^^ .
.
"KDIR_Niraj.pdf"^^ .
.
"EXPLOITING N-GRAM IMPORTANCE AND ADDITIONAL KNOWLEDGE BASED ON WIKIPEDIA FOR IMPROVEMENTS IN GAAC BASED DOCUMENT CLUSTERING (Image (JPEG))"^^ .
.
"preview.jpg"^^ .
.
"EXPLOITING N-GRAM IMPORTANCE AND ADDITIONAL KNOWLEDGE BASED ON WIKIPEDIA FOR IMPROVEMENTS IN GAAC BASED DOCUMENT CLUSTERING (Indexer Terms)"^^ .
.
"indexcodes.txt"^^ .
.
"HTML Summary of #7148 \n\nEXPLOITING N-GRAM IMPORTANCE AND ADDITIONAL KNOWLEDGE BASED ON WIKIPEDIA FOR IMPROVEMENTS IN GAAC BASED DOCUMENT CLUSTERING\n\n" .
"text/html" .
.
"Statistical Models" .
.