# ArtHist v2
## Basic Data
* URL: <https://korpuspragmatik.ds.uzh.ch/korpora/arthistv2/>
* Name: Arthist Mailinglist V2 (2023-02)
* Sources: ArtHist.net mailing list <https://arthist.net>
* Date Range: 2001 (January) to 2022 (December)
* Creators: Niclas Bodenmann, Xenia Bojarski, Noah Bubenhofer, Daniel Burckhardt (data crawling)
* Funding: Tristan Weddigen, Noah Bubenhofer
* Usage Rights: CC BY-SA 3.0
## Corpus Metadata
* <https://korpuspragmatik.ds.uzh.ch/korpora/arthistv2/index.php?thisQ=corpusMetadata&uT=y>
## Annotation
<https://korpuspragmatik.ds.uzh.ch/korpora/arthistv2/index.php?thisQ=corpusMetadata&uT=y>
## Short Description
The corpus contains all newsletters from arthist.net from 2001 to 2022. Language was detected automatically at the sentence level,
which enables POS tagging with language-specific models.
Parts of speech are annotated with the Universal POS tagset (UPOS).
Named entities are annotated at a level of detail that depends on the language, so the named-entity tags can differ between languages.
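
The idea of detecting the language per sentence and then handing each sentence to a language-specific tagger can be sketched as follows. This is only a toy illustration: the stopword lists, the function names (`split_sentences`, `detect_language`, `group_by_language`) and the sample message are hypothetical, and the tools actually used to build the corpus are not documented here. A real pipeline would use a trained language identifier and a proper POS tagger.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword lists (assumption: NOT the method used for the corpus).
STOPWORDS = {
    "en": {"the", "and", "of", "is", "in", "to", "for"},
    "de": {"der", "die", "das", "und", "ist", "von", "mit"},
    "fr": {"le", "la", "les", "et", "est", "pour", "des"},
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def detect_language(sentence):
    """Return the language with the most stopword hits, or 'unknown' if there are none."""
    tokens = re.findall(r"\w+", sentence.lower())
    scores = {lang: sum(t in words for t in tokens) for lang, words in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def group_by_language(text):
    """Group the sentences of one message by detected language."""
    groups = defaultdict(list)
    for sentence in split_sentences(text):
        groups[detect_language(sentence)].append(sentence)
    return dict(groups)


# Hypothetical multilingual message, as often found on a mailing list.
message = (
    "The deadline for the conference is in May. "
    "Die Tagung ist von Interesse und das Fach ist breit. "
    "Le colloque est ouvert pour les doctorants."
)
groups = group_by_language(message)
for lang, sentences in groups.items():
    # Each group could now be passed to a tagger for that language.
    print(lang, sentences)
```

Working at the sentence level rather than the message level matters for this kind of material, because a single newsletter can mix several languages.
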

The corpus consists of texts in the following languages (approximate percentages):

* 46 % English
* 44 % German
* 6 % French
* 2 % Italian
* 2 % Spanish
* 0.2 % Portuguese

The corpus contains 20,183,463 tokens in 31,190 texts.
Pictures are not included in this corpus.
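
The figures above allow a quick sanity check of the corpus size. The sketch below computes the average text length and rough per-language token counts. It assumes that the language percentages refer to tokens, which the description does not state; if they refer to sentences or texts instead, the per-language estimates do not apply. All variable names are illustrative.

```python
# Figures taken from the corpus description.
TOTAL_TOKENS = 20_183_463
TOTAL_TEXTS = 31_190
SHARES = {  # percentages as listed in the description
    "English": 46,
    "German": 44,
    "French": 6,
    "Italian": 2,
    "Spanish": 2,
    "Portuguese": 0.2,
}

# Average length of a text in tokens.
avg_tokens_per_text = TOTAL_TOKENS / TOTAL_TEXTS

# The listed shares are rounded, so they need not sum to exactly 100.
share_sum = sum(SHARES.values())

# ASSUMPTION: the shares are token shares. These are rough estimates only.
estimated_tokens = {
    lang: round(TOTAL_TOKENS * pct / 100) for lang, pct in SHARES.items()
}

print(f"average tokens per text: {avg_tokens_per_text:.1f}")
print(f"sum of listed shares: {share_sum:.1f} %")
for lang, count in estimated_tokens.items():
    print(f"{lang}: ~{count:,} tokens")
```

On average a text has roughly 647 tokens, and the listed shares add up to slightly more than 100 % because of rounding.
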
## Publications