GloWbE: Corpus of Global Web-Based English
Monday, 15 April 2013, 21:39

This new corpus contains 1.9 billion words drawn from 1.8 million web pages (including blogs) from 20 different English-speaking countries (US, UK, NZ, India, Hong Kong, etc.). GloWbE is 4-5 times as large as COCA and about 20 times as large as the BNC, and thus yields much richer data for some low-frequency constructions.
 
The real power of GloWbE, though, is the ability to see the frequency of any word, phrase, or grammatical construction in each of the 20 different countries. You can also compare any feature across two sets of dialects, such as British and American English (more than 775 million words of text for just these two dialects). Or you could limit your search to just one or two countries, e.g. Australia (148 million words), South Africa (45 million), or Singapore (43 million), and you'll still be searching the largest online corpus for most of these twenty countries.

From M. Davies's announcement on the Corpora list
