Channel: AWOL - The Ancient World Online

Coptic Scriptorium

Coptic SCRIPTORIUM (Sahidic Corpus Research: Internet Platform for Interdisciplinary multilayer Methods) is a collaborative, digital project created by Caroline T. Schroeder (University of the Pacific) and Amir Zeldes (Georgetown University). The team is constantly growing.
Coptic SCRIPTORIUM provides a platform for interdisciplinary and computational research on texts in the Coptic language, particularly the Sahidic dialect. As an open-source, open-access initiative, the SCRIPTORIUM technologies and corpus function as a collaborative environment for digital research by any scholar working in Coptic. It provides:
  • tools to process Coptic texts
  • a searchable, richly-annotated corpus of texts using the ANNIS search and visualization architecture
  • visualizations of Coptic texts
  • a collaborative platform for scholars to use and contribute to the project
  • research results generated from the tools and corpus
We hope SCRIPTORIUM will serve as a model for future digital humanities projects utilizing historical corpora or corpora in languages outside of the Indo-European and Semitic language families.

Acephalous Work 22 by Shenoute

Abraham Our Father by Shenoute

Letters of Besa

Apophthegmata Patrum


Note: This corpus is derived from the Sahidica New Testament, which was released by Warren Wells and made available for free electronic distribution for academic use only. It is not licensed CC-BY; see the Sahidica licensing information.


Some of the tools below use a Sahidic Coptic lexicon based on data kindly provided by Prof. Tito Orlandi and the CMCL project. When using the part-of-speech tagging models or the tokenization script and its lexicon, please make sure to refer back to the CMCL project.

Part-of-Speech Tagging

Additional Annotation Tools


  • Coptic encoding converter (converts older text character systems used for fonts such as Coptic and Laser Coptic into standards-compliant Coptic Unicode characters)
    • Simple recoding script in Perl (supports CMCL, Laser Coptic and UTF-8 encoding conversion)
    • Converter between the ASCII encoding of Dirk Van Damme and Gregor Wurst and UTF-8
    • Download both converters
  • SaltNPepper - a metamodel-based Java framework for multi-format conversion
  • Excel-Plugin for importing and exporting EXMARaLDA XML, SGML, PAULA XML and subsets of TEI XML
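
To illustrate what an encoding converter of the kind listed above does, here is a minimal Python sketch that rewrites legacy font-based text as Coptic Unicode. The mapping table is hypothetical and exists only for illustration: it is not the actual CMCL, Laser Coptic, or Van Damme/Wurst mapping, and the function name `to_coptic_unicode` is likewise an assumption rather than part of any SCRIPTORIUM tool.

```python
# Hypothetical legacy-to-Unicode table. Legacy fonts store Coptic letters
# under ASCII code points; a real converter ships one table per font.
# These pairings are illustrative only, NOT the real CMCL or Laser Coptic mapping.
LEGACY_TO_UNICODE = {
    "a": "\u2c81",  # COPTIC SMALL LETTER ALFA
    "b": "\u2c83",  # COPTIC SMALL LETTER VIDA
    "g": "\u2c85",  # COPTIC SMALL LETTER GAMMA
    "n": "\u2c9b",  # COPTIC SMALL LETTER NI
    "p": "\u2ca1",  # COPTIC SMALL LETTER PI
    "r": "\u2ca3",  # COPTIC SMALL LETTER RO
    "t": "\u2ca7",  # COPTIC SMALL LETTER TAU
}


def to_coptic_unicode(legacy_text, table=LEGACY_TO_UNICODE):
    """Replace each mapped legacy character with its Coptic Unicode letter.

    Characters missing from the table (spaces, digits, punctuation)
    pass through unchanged.
    """
    return legacy_text.translate(str.maketrans(table))


if __name__ == "__main__":
    print(to_coptic_unicode("pan tar"))
```

A table-driven design keeps the conversion logic identical across fonts, so supporting another legacy encoding would only require supplying a different table.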
