Portuguese: Corpus do Português (45 million words, 1300-1999)

Authors: Mark Davies and Michael Ferreira

Summary

This corpus allows you to quickly and easily search more than 45 million words in more than 50,000 Portuguese texts from the 1300s to the 1900s. The interface allows you to search for exact words or phrases, substrings, lemmas, parts of speech, or any combination of these. You can also search for surrounding words (collocates) within a ten-word window. The corpus was funded by the US National Endowment for the Humanities and is freely available online.

Format:

Website

Access to materials

Visit the Corpus do Português website
www.corpusdoportugues.org

The corpus also allows you to compare, and view as charts, the frequency and distribution of words, phrases, and grammatical constructions across texts in at least three ways:

By register: compare spoken, fiction, newspaper, and academic texts
By dialect: compare Portugal with Brazil
By historical period: compare centuries from the 1300s to the 1900s

You can also carry out semantically based queries of the corpus. For example, you can compare and contrast the collocates of two related words to determine the difference in meaning between them. You can find the frequency and distribution of synonyms for more than 20,000 words, compare their frequency in different registers, countries, and historical periods, and use these word lists as part of other queries. Finally, you can create your own lists of semantically related words and use them directly as part of a query.
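The collocate comparison described above can be illustrated offline with a rough sketch. This is not the corpus's own implementation or interface: the function names `collocates` and `distinctive_collocates` are hypothetical, the sample sentence is invented for illustration, and the window is assumed to extend the same number of words on each side of the search word.

```python
from collections import Counter


def collocates(tokens, node, window=10):
    """Count the words found within `window` tokens on either side of `node`.

    Assumption: the window is symmetric; the corpus interface may differ.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def distinctive_collocates(tokens, word_a, word_b, window=10):
    """Return the collocates that occur with only one of the two words."""
    a = collocates(tokens, word_a, window)
    b = collocates(tokens, word_b, window)
    only_a = {w: c for w, c in a.items() if w not in b}
    only_b = {w: c for w, c in b.items() if w not in a}
    return only_a, only_b


# Invented toy text, not taken from the corpus.
sample = (
    "o menino pequeno comeu um bolo pequeno "
    "e a casa grande tem um jardim grande"
).split()

only_pequeno, only_grande = distinctive_collocates(
    sample, "pequeno", "grande", window=2
)
print(sorted(only_pequeno))
print(sorted(only_grande))
```

On real data the same idea would be applied to far larger texts, and raw counts would usually be replaced by an association measure so that very frequent words do not dominate the lists.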

LLAS Centre for Languages, Linguistics and Area Studies
