?url_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&rft.title=Mining+the+Web+for+Synonyms%3A+PMI-IR+versus+LSA+on+TOEFL&rft.creator=Turney%2C+Peter&rft.subject=Language&rft.subject=Machine+Learning&rft.subject=Statistical+Models&rft.description=This+paper+presents+a+simple+unsupervised+learning+algorithm+for+recognizing+synonyms%2C+based+on+statistical+data+acquired+by+querying+a+Web+search+engine.+The+algorithm%2C+called+PMI-IR%2C+uses+Pointwise+Mutual+Information+(PMI)+and+Information+Retrieval+(IR)+to+measure+the+similarity+of+pairs+of+words.+PMI-IR+is+empirically+evaluated+using+80+synonym+test+questions+from+the+Test+of+English+as+a+Foreign+Language+(TOEFL)+and+50+synonym+test+questions+from+a+collection+of+tests+for+students+of+English+as+a+Second+Language+(ESL).+On+both+tests%2C+the+algorithm+obtains+a+score+of+74%25.+PMI-IR+is+contrasted+with+Latent+Semantic+Analysis+(LSA)%2C+which+achieves+a+score+of+64%25+on+the+same+80+TOEFL+questions.+The+paper+discusses+potential+applications+of+the+new+unsupervised+learning+algorithm+and+some+implications+of+the+results+for+LSA+and+LSI+(Latent+Semantic+Indexing).&rft.publisher=Springer-Verlag&rft.contributor=De+Raedt%2C+Luc&rft.contributor=Flach%2C+Peter&rft.date=2001&rft.type=Conference+Paper&rft.type=PeerReviewed&rft.format=application%2Fpostscript&rft.identifier=http%3A%2F%2Fcogprints.org%2F1796%2F1%2FECML2001.ps&rft.format=application%2Fpdf&rft.identifier=http%3A%2F%2Fcogprints.org%2F1796%2F5%2FECML2001.pdf&rft.identifier=Turney%2C+Peter+(2001)+Mining+the+Web+for+Synonyms%3A+PMI-IR+versus+LSA+on+TOEFL.+%5BConference+Paper%5D&rft.relation=http%3A%2F%2Fcogprints.org%2F1796%2F