Comenius University in Bratislava
UNESCO Chair in Plurilingual and Multicultural Communication

Aranea Project Main Site powered by NoSketch Engine (Guest Access)

Free registration is required to work with the Maius and Maximum classes of corpora.
To register, please fill in and submit this form.

Language   Aranea Corpora   Minus (125 M)   Maius (1.25 G)   Maximum
Arabic (not tagged yet) Araneum Arabicum  978 M *
Bulgarian Araneum Bulgaricum
Chinese (simplified script) Araneum Sinicum
Czech Araneum Bohemicum IV  7.10 G
Danish Araneum Danicum Beta
Dutch Araneum Nederlandicum
English Araneum Anglicum II  11.4 G
  English (Africa)   Araneum Anglicum Africanum
  English (Asia)   Araneum Anglicum Asiaticum
Estonian Araneum Estonicum II
Finnish Araneum Finnicum
French Araneum Francogallicum III  10.9 G
  French (France)   Araneum Francogallicum Gallicum  3.29 G
  French (Belgium)   Araneum Francogallicum Belgicum  365 M *
  French (Canada)   Araneum Francogallicum Canadiense II  406 M *
  French (Switzerland)   Araneum Francogallicum Helveticum  229 M *
  French (Africa)   Araneum Francogallicum Africanum II  310 M *
Georgian Araneum Georgianum II Beta  864 M *
German Araneum Germanicum III  8.91 G
  German (Germany)   Araneum Germanicum Germanicum  5.59 G
  German (Austria)   Araneum Germanicum Austriacum  441 M *
  German (Switzerland)   Araneum Germanicum Helveticum  381 M *
Hungarian Araneum Hungaricum
Italian Araneum Italicum
Latin Araneum Latinum  109 M *
Latvian Araneum Lettonicum  671 M *
Norwegian Araneum Norvegicum II Beta  3.53 G
Persian Araneum Persicum Beta  3.09 G
Polish Araneum Polonicum
Portuguese Araneum Portugallicum
Romanian Araneum Dacoromanicum
Russian Araneum Russicum III  19.8 G
  Russian (Russia)   Araneum Russicum Russicum
  Russian (non-Russia)   Araneum Russicum Externum
Slovak Araneum Slovacum VII  5.30 G
Spanish Araneum Hispanicum
Swedish Araneum Suedicum
Ukrainian Araneum Ucrainicum Beta
Uzbek Araneum Uzbecicum
Language   Other Corpora   Minus (120 M)   Maius (1.20 G)   Maximum
Arabic (not tagged yet) Ajdir Arabicum  50.0 M *
Croatian Zagrabia Croatica (hrWaC)
Russian Taiga Russica  4.44 G
Russian Vicipaedia Russica  476 M
Slovak Omnia Slovaca Publica II  4.34 G
Slovene Aemona Slovena (ccGigafida)
* Parvum (< 125 M) and Medium (< 1.25 G) class corpora are available only for some languages.

News

17 January 2024
Araneum Slovacum VII (24.02) released
13 March 2023
Araneum Georgianum II Beta (23.02) released
18 February 2023
Araneum Norvegicum II Beta (23.02) released
13 February 2023
Araneum Danicum Beta (23.02) released
21 October 2022
Araneum Persicum Beta (22.10bis) released
14 August 2022
Araneum Ucrainicum Beta (22.08) released
21 June 2020
New Aranea Germanica (20.06) released
22 May 2020
New Aranea Francogallica (20.05) released
30 March 2020
Araneum Bohemicum IV (20.03) released


Useful links

Araneum Universal Tagset (AUT)

Corpus Query Language (CQL)
Sketch Engine Documentation


Feedback

For questions and comments, please use this form.