Channel: AWOL - The Ancient World Online

Lemmatized Ancient Greek Texts
This repository contains Ancient Greek texts that have been automatically tokenized, POS-tagged, sentence-split, and lemmatized. The texts come from the following repositories, which currently contain most of the Ancient Greek texts freely accessible over the internet:
  1. https://github.com/PerseusDL/canonical-greekLit/releases/tag/0.0.236
  2. https://github.com/OpenGreekAndLatin/First1KGreek/releases/tag/1.1.1802
The tokenization, POS tagging, and sentence splitting are taken from the data provided in:
  1. https://github.com/gcelano/POStaggedAncientGreekXML/releases/tag/v1.2.0
Refer to these repositories for further documentation. In the present repository, each token's combination of POS tag and word form has been automatically matched against the entries in Morpheus and MorpheusUnderPhilologic. Because those databases also record lemmata, the lemma of each matched token could be extracted automatically.
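
The matching step described above can be pictured as a lookup keyed on the pair (word form, POS tag). The following is only a minimal sketch of that idea, not the repository's actual code: the `LEXICON` entries, the tag names `noun` and `verb`, and the function names are all hypothetical, and the real Morpheus databases use their own schema and tagset. Unicode normalization to NFC is added here as an assumption, since Greek text can encode the same accented letter in more than one way.

```python
import unicodedata
from typing import List, Optional, Tuple


def nfc(text: str) -> str:
    """Normalize to NFC so visually identical Greek strings compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical toy entries: (word form, POS tag, lemma).
# A real lexicon would be loaded from the Morpheus data instead.
_RAW_ENTRIES = [
    ("λόγον", "noun", "λόγος"),
    ("λέγει", "verb", "λέγω"),
    ("ἀνθρώπους", "noun", "ἄνθρωπος"),
]

# Index the lexicon by (form, POS) so each token needs one dictionary lookup.
LEXICON = {(nfc(form), pos): nfc(lemma) for form, pos, lemma in _RAW_ENTRIES}


def lemmatize(form: str, pos: str) -> Optional[str]:
    """Return the lemma for a (form, POS) pair, or None if there is no match."""
    return LEXICON.get((nfc(form), pos))


def lemmatize_tokens(
    tokens: List[Tuple[str, str]]
) -> List[Tuple[str, str, Optional[str]]]:
    """Attach a lemma (or None) to every (form, POS) token."""
    return [(form, pos, lemmatize(form, pos)) for form, pos in tokens]


if __name__ == "__main__":
    for form, pos, lemma in lemmatize_tokens([("λόγον", "noun"), ("λέγει", "verb")]):
        print(form, pos, lemma)
```

Keying on the pair rather than on the word form alone means that a form is only assigned a lemma when its POS tag agrees with the lexicon entry; tokens with no matching entry are left without a lemma (`None`) in this sketch.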