Corpus Linguistics is now seen as the study of linguistic phenomena
through large collections of machine-readable texts: corpora. These
are used in a number of research
areas, ranging from the descriptive study of the syntax of a language
to prosody and language learning, to mention but a few. An overview of some of
the areas where corpora have been used can be found on the Research areas page.
The use of real examples of texts in the study of language is not
new in the history of linguistics. However, Corpus Linguistics has developed
considerably in recent decades thanks to the possibilities offered
by the processing of natural language with computers. The availability of
computers and machine-readable text has made it possible to get data quickly and
easily and also to have this data presented in a format suitable for analysis.
Corpus linguistics is, however, not the same as simply obtaining language data
through the use of computers. Corpus linguistics is the study and analysis of
data obtained from a corpus. The main task of the corpus linguist is not to find
the data but to analyse it. Computers are useful, and sometimes
indispensable, tools used in this process.
If you want to learn more about corpora and corpus linguistics you can use the
links below. On the Background page you can
follow the development of corpus linguistics through presentations of some
central corpora/kinds of corpora. On the
Working with Corpora page you will find information about things to
think about when you want to use corpora for language learning or research.
Use the Tutorial to learn how to make corpus searches and analyse the results,
or go straight to the Search Engine to make online searches in a number of corpora.