TLA – Introduction

The Thesaurus Linguae Aegyptiae (TLA) is the non-commercial, freely accessible digital publication platform of the Academies’ project “Strukturen und Transformationen des Wortschatzes der ägyptischen Sprache: Text- und Wissenskultur im Alten Ägypten” (“Structure and Transformation in the Vocabulary of the Egyptian Language: Texts and Knowledge in the Culture of Ancient Egypt”). The project is funded by the Academies’ Programme of the Union of the German Academies of Sciences and Humanities and is carried out at the Berlin-Brandenburg Academy of Sciences and Humanities (Berlin) and the Saxon Academy of Sciences and Humanities (Leipzig).

The TLA provides a digital text corpus of lemmatized and annotated ancient Egyptian texts in hieroglyphic, hieratic, and Demotic scripts together with a corresponding set of lemma lists.

The TLA text corpus and its lemma lists make it possible to gather a wide range of information and to address historical, anthropological, and linguistic research questions relating to one of the earliest attested human civilizations and languages. Thanks to the continuous work of the project team and worldwide cooperation, the text corpus and lemma lists are steadily growing, offering an ever-expanding anthology of preserved ancient Egyptian texts.

The TLA text corpus and its lemma lists can be used separately, or they can be searched in combination. Search results in the text corpus are presented as sentences in Egyptological transliteration, each of which leads to its cotext and, ultimately, to the full text (the latter is not yet available in v2.0). All sentences come with a German translation, and some also have an English or French translation. Moreover, a growing number of hieroglyphic and hieratic texts also give the individual words as strings of digitally encoded hieroglyphs (JSesh Manuel de Codage and Unicode). Every lemma list item (lemma) and every text item is annotated with additional metadata, such as bibliographical references, comments, and dating.

TLA – History of the TLA

The TLA builds upon lexicographical work on the Ancient Egyptian language conducted under the direction of Adolf Erman at the Königliche Akademie der Wissenschaften zu Berlin starting in 1897, which resulted in the printed Berlin dictionary (A. Erman & H. Grapow, Wörterbuch der aegyptischen Sprache, 5 vols., 1926–1931, seven further supplement volumes until 1961). In 1992, the project “Altägyptisches Wörterbuch” (1992–2012) was inaugurated at the same academy (now: Berlin-Brandenburg Academy of Sciences and Humanities, BBAW). Its new digital agenda was the creation of an electronic lemmatized corpus of Egyptian texts. In 1996, a second team at the Saxon Academy of Sciences and Humanities in Leipzig joined the project.

The first step of the digital agenda was taken in 1999 when the text corpus slips that had been printed, edited and lexicographically arranged in boxes in the decades of work on the original Wörterbuch were digitized and published online as the Digitalisiertes Zettelarchiv (DZA) [des Wörterbuchs der ägyptischen Sprache]. While the printed outcome of the original Wörterbuch project in the early 20th century referenced only about 10% of the collected material in the supplementary Belegstellen volumes (1935–1953), the digitized slip archive offered and still offers free access worldwide to the complete textual basis of the original Wörterbuch project.

The first TLA web application was mainly programmed by the former research coordinator and project leader Stefan Seidlmayer and was put online on October 31, 2004, Adolf Erman’s 150th birthday. Since then, the TLA has become a reliable and powerful research instrument for Egyptologists. Its search functions, such as the search for Egyptian words (lemmata), the search for word collocations and combinations, and the search for hieroglyphic spellings (of lemmata), are designed to help scholars pursue their philological, linguistic, or historical research interests. By 2014, the year of the last update of this first, now “legacy” TLA, the text corpus had grown to about 1.2 million instances (tokens) of Egyptian lemmata.

In the meantime, the TLA’s editorial system (Berlin Text System, BTS) has grown to contain about 1.49 million lemma tokens. However, only texts whose authors have agreed to share their contributions under a free license will be published in the present new TLA. The current version of the “new TLA” offers only limited functionality. Therefore, the “legacy TLA” will stay online as long as technically possible, or until the steadily improving functionality of the “new TLA” makes the “legacy TLA” obsolete.

Some prospects for the future

A complete lemma network of the Egyptian-Coptic language

The two reference lists for lemmatization that are already implemented, the hieroglyphic/hieratic lemma list and the Demotic lemma list, are to be complemented by a third component: the Coptic lemma list, including the Greek loanwords that became part of the Egyptian lexicon in the 1st millennium CE (see lemma lists).

TLA raw data: JSON and TEI/EpiDoc XML

The texts, lemma lists, and thesauri that are edited in the input application BTS will not only be published on the TLA website but will also be made available in a lasting online repository, both as raw JSON files and as Text Encoding Initiative (TEI) XML files.