Metadata in TEI format
<?xml version="1.0" encoding="utf-8"?>
<title>MULTEXT-East corpus</title>
<author>Dept. of Knowledge Technologies - Jožef Stefan Institute</author>
<p>Not specified</p>
<language>Bulgarian, Croatian, Czech, English, Estonian, Hungarian, Lithuanian, Macedonian, Persian, Polish, Resian, Romanian, Russian, Serbian, Slovak, Slovene, and Ukrainian</language>
<p>Multilingual Text Tools and Corpora</p>
<p>The MULTEXT-East resources are a multilingual dataset for language engineering research and development. For Bulgarian, Croatian, Czech, English, Estonian, Hungarian, Lithuanian, Macedonian, Persian, Polish, Resian, Romanian, Russian, Serbian, Slovak, Slovene, and Ukrainian, the dataset contains some or all of the following language resources: the MULTEXT-East morphosyntactic specifications, lexica, and annotated "1984" corpus; the MULTEXT-East parallel and comparable text and speech corpora; and associated documentation.</p>