Decrypting cuneiform with AI

In the »Electronic Babylonian Library« project, an Assyriology team is working on enabling AI to read and translate cuneiform texts.
The assyriologist Dr Adrian Heinrich presents a clay tablet with cuneiform text from the Hilprecht Collection at the University of Jena (»Frau Professor Hilprecht Collection of Babylonian Antiquities«).
Image: Anne Günther (University of Jena)

By Stephan Laudien


Cuneiform tablets are the oldest surviving written records of humanity. As early as 4,000 years ago, literate people would score characters in soft clay. Some surviving texts are domestic notices, others pertain to commerce, and some are liturgical and poetic texts. The most famous include the Epic of Gilgamesh and the Code of Hammurabi. These texts were produced in Mesopotamia, a region between the Euphrates and the Tigris. Many of the clay tablets are only the size of a bank card, and yet they have endured for millennia. Even the destructive force of fire has not damaged them. Quite the opposite, in fact: when fired, clay becomes even more durable.

For a number of reasons, unravelling the mysteries of these cuneiform tablets is painstaking work. For one thing, there is the sheer quantity of these artefacts, with roughly half a million cuneiform objects around the world, according to Dr Adrian Heinrich. The 35-year-old assyriologist is an assistant to Prof. Dr Johannes Hackl at the Institute of Near Eastern Studies, Indo-European Studies and the Archaeology of Prehistory to the Early Middle Ages. The duo is examining the Hilprecht Collection together with its curator, Maria Young. This Jena-based collection comprises around 3,300 artefacts, making it the second-largest in Germany.

A further difficulty is the fact that collections are scattered around the world, which means that artefacts that originally belonged together are often sitting in display cabinets vast distances apart. There are also many broken pieces containing just a handful of cuneiform characters. The third challenge is the limited number of specialists in the field of Ancient Near Eastern studies. Despite this, researchers have been able to decipher cuneiform texts character by character since the mid-19th century.

As Adrian Heinrich explains, artificial intelligence (AI) could significantly accelerate the entire process. The team in Jena is a cooperation partner on the Electronic Babylonian Library project initiated by Munich-based assyriologist Prof. Dr Enrique Jiménez, which aims to train an artificial intelligence to read and translate cuneiform texts. Precise scans that create three-dimensional images of artefacts are fundamental to this work. »We want to bring the perspectives of researchers and collection curators together,« says Heinrich. This would involve digital platforms making cuneiform texts accessible to everyone. The digital copies would also be available to scientists around the world.

Decrypting and reassembling tablet fragments

The first step, however, is to train the AI to read these texts. Hackl’s predecessor, Prof. Manfred Krebernik, began to digitize cuneiform tablets in cooperation with the Max Planck Institute for the History of Science. Further digital copies are now being produced in cooperation with the Thuringian University and State Library. It is vital to ensure that the tablets are legible from all sides. »Each character can have several meanings, which creates different levels of text,« says Dr Heinrich. This has to be taken into consideration when »feeding« the AI with data.

At present, the AI models are still being trained for machine translation. Existing translations are also fed into the models, some of them drawn from the estates of past researchers. As Prof. Hackl explains, an AI similar to one established in the field of genetic research is used for this. »The model searches for certain patterns.« The system learns to link pictorial elements with content. The researchers hope this will enable them to answer even complex search queries regarding entire corpora of text.

This approach has already yielded results. According to Heinrich, the accuracy rate »is currently between 80% and 90%«. The application of AI promises a further benefit: it will enable researchers to decrypt and reassemble small tablet fragments. This would represent a major step forward for scientific research.