PatTR: Patent Translation Resource
Monday, 15 April 2013, 21:43

PatTR is a parallel corpus of patent text for the German-English language pair.

The corpus was constructed from EPO, WIPO and USPTO patent documents extracted from the MAREC collection. It contains 23 million sentence pairs drawn from all patent text sections.

All sentences are labeled with metadata: patent document ID, patent family, patent classification, and publication date.

The corpus is distributed under a Creative Commons License. For more information and download, please see
http://www.cl.uni-heidelberg.de/statnlpgroup/pattr