TY  - CONF
ID  - www20098
UR  - http://www2009.eprints.org/8/
A1  - Li, Liangda
A1  - Zhou, Ke
A1  - Xue, Gui-Rong
A1  - Zha, Hongyuan
A1  - Yu, Yong
Y1  - 2009/04//
N2  - Document summarization plays an increasingly important role with the exponential growth of documents on the Web. Many supervised and unsupervised approaches have been proposed to generate summaries from documents. However, these approaches seldom simultaneously consider summary diversity, coverage, and balance issues which to a large extent determine the quality of summaries. In this paper, we consider extract-based summarization emphasizing the following three requirements: 1) diversity in summarization, which seeks to reduce redundancy among sentences in the summary; 2) sufficient coverage, which focuses on avoiding the loss of the document's main information when generating the summary; and 3) balance, which demands that different aspects of the document need to have about the same relative importance in the summary. We formulate the extract-based summarization problem as learning a mapping from a set of sentences of a given document to a subset of the sentences that satisfies the above three requirements. The mapping is learned by incorporating several constraints in a structure learning framework, and we explore the graph structure of the output variables and employ structural SVM for solving the resulted optimization problem. Experiments on the DUC2001 data sets demonstrate significant performance improvements in terms of F1 and ROUGE metrics.
TI  - Enhancing Diversity, Coverage and Balance for Summarization through Structure Learning
SP  - 71
M2  - Madrid, Spain
AV  - public
EP  - 71
T2  - 18th International World Wide Web Conference
ER  -