Turney, Peter D. (2004) Human-Level Performance on Word Analogy Questions by Latent Relational Analysis. [Departmental Technical Report] (Unpublished)
Full text available as:
|
PDF
217Kb |
Abstract
This paper introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, machine translation, and information retrieval. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason/stone is analogous to the pair carpenter/wood; the relations between mason and stone are highly similar to the relations between carpenter and wood. Past work on semantic similarity measures has mainly been concerned with attributional similarity. For instance, Latent Semantic Analysis (LSA) can measure the degree of similarity between two words, but not between two relations. Recently the Vector Space Model (VSM) of information retrieval has been adapted to the task of measuring relational similarity, achieving a score of 47% on a collection of 374 college-level multiple-choice word analogy questions. In the VSM approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. LRA extends the VSM approach in three ways: (1) the patterns are derived automatically from the corpus (they are not predefined), (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data (it is also used this way in LSA), and (3) automatically generated synonyms are used to explore reformulations of the word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the average human score of 57%. On the related problem of classifying noun-modifier relations, LRA achieves similar gains over the VSM, while using a smaller corpus.
Item Type: | Departmental Technical Report |
---|---|
Keywords: | analogies, semantic relations, vector space model, noun-modifier expressions, latent relational analysis |
Subjects: | Computer Science > Language Linguistics > Computational Linguistics Linguistics > Semantics Computer Science > Machine Learning |
ID Code: | 3981 |
Deposited By: | Turney, Peter |
Deposited On: | 11 Dec 2004 |
Last Modified: | 11 Mar 2011 08:55 |
References in Article
Select the SEEK icon to attempt to find the referenced article. If it does not appear to be in cogprints you will be forwarded to the paracite service. Poorly formated references will probably not work.
Metadata
- ASCII Citation
- Atom
- BibTeX
- Dublin Core
- EP3 XML
- EPrints Application Profile (experimental)
- EndNote
- HTML Citation
- ID Plus Text Citation
- JSON
- METS
- MODS
- MPEG-21 DIDL
- OpenURL ContextObject
- OpenURL ContextObject in Span
- RDF+N-Triples
- RDF+N3
- RDF+XML
- Refer
- Reference Manager
- Search Data Dump
- Simple Metadata
- YAML
Repository Staff Only: item control page