?url_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&rft.title=Learning+to+Extract+Keyphrases+from+Text&rft.creator=Turney%2C+Peter&rft.subject=Archives&rft.subject=Language&rft.subject=Machine+Learning&rft.subject=Statistical+Models&rft.description=Many+academic+journals+ask+their+authors+to+provide+a+list+of+about+five+to+fifteen+key+words%2C+to+appear+on+the+first+page+of+each+article.+Since+these+key+words+are+often+phrases+of+two+or+more+words%2C+we+prefer+to+call+them+keyphrases.+There+is+a+surprisingly+wide+variety+of+tasks+for+which+keyphrases+are+useful%2C+as+we+discuss+in+this+paper.+Recent+commercial+software%2C+such+as+Microsoft%27s+Word+97+and+Verity%27s+Search+97%2C+includes+algorithms+that+automatically+extract+keyphrases+from+documents.+In+this+paper%2C+we+approach+the+problem+of+automatically+extracting+keyphrases+from+text+as+a+supervised+learning+task.+We+treat+a+document+as+a+set+of+phrases%2C+which+the+learning+algorithm+must+learn+to+classify+as+positive+or+negative+examples+of+keyphrases.+Our+first+set+of+experiments+applies+the+C4.5+decision+tree+induction+algorithm+to+this+learning+task.+The+second+set+of+experiments+applies+the+GenEx+algorithm+to+the+task.+We+developed+the+GenEx+algorithm+specifically+for+this+task.+The+third+set+of+experiments+examines+the+performance+of+GenEx+on+the+task+of+metadata+generation%2C+relative+to+the+performance+of+Microsoft%27s+Word+97.+The+fourth+and+final+set+of+experiments+investigates+the+performance+of+GenEx+on+the+task+of+highlighting%2C+relative+to+Verity%27s+Search+97.+The+experimental+results+support+the+claim+that+a+specialized+learning+algorithm+(GenEx)+can+generate+better+keyphrases+than+a+general-purpose+learning+algorithm+(C4.5)+and+the+non-learning+algorithms+that+are+used+in+commercial+software+(Word+97+and+Search+97).&rft.date=1999&rft.type=Departmental+Technical+Report&rft.type=NonPeerReviewed&rft.format=application%2Fpostscript&rft.identifier=http%3A%2F%2Fcogprints.org%2F1802%2F1%2FERB-1057.ps&rft.format=application%2Fpdf&rft.identifier=http%3A%2F%2Fcogprints.org%2F1802%2F5%2FERB-1057.pdf&rft.identifier=Turney%2C+Peter+(1999)+Learning+to+Extract+Keyphrases+from+Text.+%5BDepartmental+Technical+Report%5D+(Unpublished)&rft.relation=http%3A%2F%2Fcogprints.org%2F1802%2F