From Corpora list announcement:

The error-annotated German learner corpus Falko has released a new subcorpus, FalkoEssayL2WHIGv2.0, which contains 195 argumentative essays by advanced learners of German (117,189 tokens).

For each text, two full-text target hypotheses have been manually annotated: a minimal morphosyntactic normalization and an extended semantic-pragmatic version.

Each representation has been POS-tagged and lemmatized (TreeTagger & RFTagger). RFTagger's morphological annotation has been integrated as well.

On this basis, tags have been added that mark where the learner text differs from each target hypothesis, both in the tokens themselves and in their POS and lemma annotations.
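
As a rough illustration of how such difference tags can be derived, the following minimal Python sketch compares one annotation layer of the learner text with the same layer of a target hypothesis. It assumes the two layers are already aligned token by token. The function name diff_tags, the tag labels CHA, INS and DEL, and the sample POS values are illustrative assumptions, not taken from the Falko guidelines or tooling.

```python
from typing import List, Optional, Tuple

# One aligned position: (learner value, target-hypothesis value).
# None marks a token that exists on only one side of the alignment.
AlignedPair = Tuple[Optional[str], Optional[str]]


def diff_tags(pairs: List[AlignedPair]) -> List[str]:
    """Return one illustrative difference tag per aligned position.

    The labels are hypothetical placeholders, not the official tag set.
    """
    tags = []
    for learner, target in pairs:
        if learner is None and target is None:
            raise ValueError("an aligned position needs at least one token")
        if learner is None:
            tags.append("INS")  # present only in the target hypothesis
        elif target is None:
            tags.append("DEL")  # present only in the learner text
        elif learner != target:
            tags.append("CHA")  # annotation differs between the two layers
        else:
            tags.append("")     # no difference at this position
    return tags


# Invented POS layers (STTS-style tags), already aligned token by token.
pos_pairs = [
    ("PPER", "PPER"),
    ("VVINF", "VVFIN"),
    (None, "ART"),
    ("NN", "NN"),
    ("ADV", None),
]
print(diff_tags(pos_pairs))
```

The same function could be applied to the lemma layer by passing aligned lemma pairs instead of POS pairs.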

The corpus is freely available under the following link:


The annotation guidelines can be found here:
http://www.linguistik.hu-berlin.de/institut/professuren/korpuslinguistik/forschung/falko/Falko-Handbuchv2.0.pdf

Linguistic Field: Language Acquisition, Text/Corpus Linguistics


Marc Reznicek
Research Associate
Humboldt-Universität zu Berlin
Tel: +49 (0)30 2093-9727
Dorotheenstr. 24, 10099 Berlin
Room 3.310