International digital library of artifacts inscribed with cuneiform writing

The mission of the Cuneiform Digital Library Initiative (CDLI) is to collect, preserve, and make available images, text, and metadata of all artifacts inscribed with the cuneiform script. It is the sole project with this mission, and we estimate that our 334,000 catalogue entries cover some two-thirds of all sources in collections around the world. Our data are publicly available, and our audience consists primarily of scholars and students, with growing numbers of informal learners.

At the heart of CDLI is a group of developers, language scientists, machine learning engineers, and cuneiform specialists who develop software infrastructure to process and analyze curated data. To this end, we are actively developing two projects: Framework Update, and Machine Translation and Automated Analysis of Cuneiform Languages. As part of these projects, we are building a natural language processing platform that lets specialists in ancient languages undertake automated annotation and translation of Sumerian-language texts, thus enabling data-driven study of the languages, culture, history, economy, and politics of ancient Near Eastern civilizations. As part of this platform, we are focusing on data standardization using Linked Open Data to foster best practices in data exchange and integration with other digital humanities and computational philology projects.


  • python
  • php
  • mysql
  • java
  • html/css


Cuneiform Digital Library Initiative (CDLI) 2019 Projects

  • Amaan Iqbal
    CDLI - Search Results Visualizations
    CDLI has rich geographical and temporal data at its disposal. Currently, this information is not fully utilized. Although the data schema is being...
  • Vishal Thamizharasan
    Computer vision challenge for the cuneiform script
    The current display system used at CDLI requires that a user reads a text to absorb visual and text information simultaneously, and to interpret the...
  • Sagar Sagar
    Multiple Layer Annotations Querying
    Currently, there is no tool available to integrate into a website that has the capacity to query through multiple layers of linguistic annotations...
  • Ravneet Punia
    Neural Machine Translation for Sumerian and English
    The project aims to build a machine translation model that can convert Sumerian (a language used around 2000 BC) to English using Neural...
  • rillian
    TEI Export for the CDLI Corpus
    I am writing an export tool for the CDLI dataset so it can be used with the Scaife viewer. The tool will need to convert the native ATF markup used...