Language | Aranea Corpora | Minus 125 M | Maius 1.25 G | Maximum |
Arabic (not tagged yet) | Araneum Arabicum | | | 978 M * |
Bulgarian | Araneum Bulgaricum | | | |
Chinese (simplified script) | Araneum Sinicum | | | |
Czech | Araneum Bohemicum IV | | | 7.10 G |
Danish | Araneum Danicum Beta | | | |
Dutch | Araneum Nederlandicum | | | |
English | Araneum Anglicum II | | | 11.4 G |
English (Africa) | Araneum Anglicum Africanum | | | |
English (Asia) | Araneum Anglicum Asiaticum | | | |
Estonian | Araneum Estonicum II | | | |
Finnish | Araneum Finnicum | | | |
French | Araneum Francogallicum III | | | 10.9 G |
French (France) | Araneum Francogallicum Gallicum | | | 3.29 G |
French (Belgium) | Araneum Francogallicum Belgicum | | | 365 M * |
French (Canada) | Araneum Francogallicum Canadiense II | | | 406 M * |
French (Switzerland) | Araneum Francogallicum Helveticum | | | 229 M * |
French (Africa) | Araneum Francogallicum Africanum II | | | 310 M * |
Georgian | Araneum Georgianum III Beta | | | 1.19 G * |
German | Araneum Germanicum III | | | 8.91 G |
German (Germany) | Araneum Germanicum Germanicum | | | 5.59 G |
German (Austria) | Araneum Germanicum Austriacum | | | 441 M * |
German (Switzerland) | Araneum Germanicum Helveticum | | | 381 M * |
Hungarian | Araneum Hungaricum | | | |
Italian | Araneum Italicum | | | |
Latin | Araneum Latinum | | | 109 M * |
Latvian | Araneum Lettonicum | | | 671 M * |
Norwegian | Araneum Norvegicum II Beta | | | 3.53 G |
Persian | Araneum Persicum Beta | | | 3.09 G |
Polish | Araneum Polonicum | | | |
Portuguese | Araneum Portugallicum | | | |
Romanian | Araneum Dacoromanicum | | | |
Russian | Araneum Russicum III | | | 19.8 G |
Russian (Russia) | Araneum Russicum Russicum | | | |
Russian (non-Russia) | Araneum Russicum Externum | | | |
Slovak | Araneum Slovacum VII | | | 5.30 G |
Spanish | Araneum Hispanicum | | | |
Swedish | Araneum Suedicum | | | |
Ukrainian | Araneum Ucrainicum Beta | | | |
Uzbek | Araneum Uzbecicum | | | |

Language | Other Corpora | Minus 120 M | Maius 1.20 G | Maximum |
Arabic (not tagged yet) | Ajdir Arabicum | | | 50.0 M * |
Croatian | Zagrabia Croatica (hrWaC) | | | |
Russian | Taiga Russica | | | 4.44 G |
Russian | Vicipaedia Russica | | | 476 M |
Slovak | Omnia Slovaca Publica II | | | 4.34 G |
Slovene | Aemona Slovena (ccGigafida) | | | |
* Parvum (< 125 M) and Medium (< 1.25 G) class corpora are available only for some languages.