The following is a history of the different
corpora, as well as a list of some upcoming changes.
|
2009. Fall ? |
Improvements to the corpus architecture and
interface, including:
1. Improved search syntax, with optional / variable-length
elements in the search string
2. Chart displays with relative frequencies of competing forms (who
/ whom, help V / help to V, etc.)
3. Improved KWIC (context) displays, including full sources for
all entries
4. "Random word" function, with useful features for language
learners (done) |
|
2009. Aug |
Added about 15 million words to the
Corpus of Contemporary
American English (COCA), for Oct 2008 - Jun 2009. |
|
2009. May |
1. Redesigned the main
http://corpus.byu.edu/
portal, and added new tools for collaboration.
2. Added new features to the corpus interface: a history of
searches, and the ability to annotate queries (notes) and share them
with others |
|
2009. Mar |
Awarded a grant from the
National Endowment for the
Humanities to create the Corpus of Historical American
English (COHA), a balanced, 300-million-word corpus of American
English from the early 1800s to the present. A beta version of this
corpus will be available in August 2010. (More
information...) |
|
2008. Oct |
Added about 15 million words to the
Corpus of Contemporary
American English (COCA), for Jan-Sep 2008. |
|
2008. June |
Applied the new architecture to the
Corpus do Português |
|
2008. Apr |
Applied the new architecture to the
British National Corpus
and the TIME Corpus |
|
2008. Mar |
Released the
Corpus of Contemporary
American English |
|
2007. Oct |
Finished the new (current) corpus architecture and
applied it to the
Corpus del Español. Also made major updates to this corpus,
including much-improved tagging and lemmatization for Modern
Spanish. |
|
2007. May |
Released the
TIME Corpus of American
English |
|
2006. Aug |
Released the
Corpus do Português |
|
2005. Apr |
Interface for
Register
Variation in Spanish |
|
2004. Apr |
Released VIEW, our first version of the
British National Corpus |
|
2002. Sep |
Released the first version of the
Corpus del Español |
|
Misc |
There are several other corpora with an older,
non-standard architecture and interface:
Polyglot
Bible,
Polyglot Book of Mormon,
LDS General
Conferences,
Medieval Spanish bibles, and
Latin/OSp/ModSp bibles |