TS Corpus
Thursday, 30 August 2012, 15:11

[from the description on the Corpora list]

TS Corpus is a Turkish corpus project that is freely available online. It is a general-purpose Turkish corpus containing 491 million POS-tagged tokens. TS Corpus was built and is maintained by Taner Sezer. The corpus is based on CWB (the IMS Open Corpus Workbench).


The corpus can be reached at:
http://tscorpus.com
TS Corpus offers the following features:

TS Corpus is POS-tagged
TS Corpus has morphological annotation
TS Corpus includes the lemma form of each token
Keyword in context (KWIC) view
Word & Lemma search
Frequency search
Regular expression search
Search with CQP queries
Case-sensitive search
Building frequency lists
Saving the results in different formats
New Features of the Second Version
Queries based on morphological annotation
Restricted queries
Simplified POS tag set and disambiguation
POS tags displayed on the KWIC screen and morphological annotation in the context view
Distribution of hit sets based on metadata restrictions
Hit sets can now be categorised
Users can create subcorpora
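
To make two of the listed features concrete, here is a minimal Python sketch of a keyword in context (KWIC) view and a lemma frequency list. It is an illustration only and not how TS Corpus or CWB implements these features. The token sample, the tag names and the function names kwic and lemma_frequencies are all assumptions made up for this example.

```python
from collections import Counter

# Toy sample of (word, lemma, pos) triples. Invented for illustration;
# not taken from TS Corpus, and the tag names are hypothetical.
TOKENS = [
    ("Evler", "ev", "Noun"),
    ("eski", "eski", "Adj"),
    (".", ".", "Punc"),
    ("Ev", "ev", "Noun"),
    ("yeni", "yeni", "Adj"),
    (".", ".", "Punc"),
]


def kwic(tokens, lemma, window=2):
    """Return (left context, keyword, right context) for each lemma match."""
    lines = []
    for i, (word, tok_lemma, _pos) in enumerate(tokens):
        if tok_lemma == lemma:
            left = " ".join(w for w, _, _ in tokens[max(0, i - window):i])
            right = " ".join(w for w, _, _ in tokens[i + 1:i + 1 + window])
            lines.append((left, word, right))
    return lines


def lemma_frequencies(tokens, skip_pos=("Punc",)):
    """Count lemmas, ignoring tokens whose tag is listed in skip_pos."""
    return Counter(lemma for _, lemma, pos in tokens if pos not in skip_pos)


if __name__ == "__main__":
    # Lemma search: both inflected forms of "ev" are matched.
    for left, keyword, right in kwic(TOKENS, "ev"):
        print(f"{left:>10} [{keyword}] {right}")
    # Frequency list, most frequent lemma first.
    for lemma, count in lemma_frequencies(TOKENS).most_common():
        print(lemma, count)
```

Searching by lemma rather than by word form is what lets a single query find both "Evler" and "Ev" in the sample, which is why the lemma annotation matters for a morphologically rich language such as Turkish.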


Further information can be found on the corpus web page at http://tscorpus.com and in the documentation at http://tscorpus.com/wiki