AI model is solving 4,500-year-old Cuneiform translation mysteries
Archaeologists just got a powerful new tool. AI is now successfully predicting lost passages of ancient texts, meaning 4,500-year-old cuneiform tablets with missing sections can now be read.
A deep-learning artificial intelligence (AI) engine has been built that can predict what “should” come next in broken lines of text, just like when you punch a word into a search engine and the auto-predict function kicks in.
Mesopotamia, in the Fertile Crescent, is one of the world's oldest known cradles of civilization. It encompassed what is present-day Iraq, as well as parts of Iran, Turkey, Syria and Kuwait, and from it grew the Sumerian, Assyrian and Babylonian empires. Cuneiform emerged in the early Bronze Age, and this logo-syllabic script was used to write many languages of the Ancient Near East right up to the beginning of the Common Era. The script takes its name from the characteristic wedge-shaped impressions that form its signs, but most of the surviving examples exist on severely damaged, partial clay tablets.
A team of scientists from Jerusalem's Hebrew University has now made a major breakthrough with a new AI model. In a new paper, which will be formally presented in November at the Conference on Empirical Methods in Natural Language Processing, the authors wrote that text from “10,000 cuneiform tablets dating from 2500 BC to 100 AD” was fed into the new AI program. Known as “the Babylonian Engine,” the new AI model successfully predicted missing words, phrases and sentences with close to 90% accuracy.
Looking Into The Past, To Make Predictions
Several Mesopotamian civilizations, including the Babylonians and Assyrians, spoke Akkadian, the oldest known Semitic language. Cuneiform, with its distinctive wedge-shaped characters, was the script used to write it. Co-author of the new paper, Gabriel Stanovsky, a computer scientist at the Hebrew University, told New Scientist that clay tablets are the main record from the Mesopotamian cultures, “including religious texts, bureaucratic records, royal decrees, and more.” This is why they are the “target of extensive transcription and transliteration efforts,” added the computer scientist.
Artificial intelligence is a machine imitation of human intelligence, in which software running on computers takes the place of biological neurons. AI has recently become adept at complex language-processing tasks. In this case, the AI was fed transcriptions of 10,000 cuneiform tablets and was trained on text in 104 languages, according to a report in the Daily Mail.
The AI Model Is Not A New Generation Of Digital Archaeologists
Shai Gordin, director of the Digital Humanities Ariel Lab at Israel's Ariel University, wrote in an earlier release that archaeologists and language specialists have traditionally slogged over photographs of clay tablets, work that Gordin describes as very “subjective and time-consuming.” The manual method is becoming ever more difficult, as many tablets have now deteriorated so much that researchers depend on “contextual cues to manually fill in missing text,” wrote Gordin.
Professor Gordin explained that the team of researchers used a model that was already trained on Semitic languages related to Akkadian, including Hebrew. They first tested the system by hiding parts of tablets that had already been interpreted, and the model predicted the missing words with “89 percent accuracy.” The Babylonian Engine was then used to fill in the gaps on tablets from ancient Persia dating between the sixth and fourth centuries BC.
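The hide-and-predict test described above can be illustrated with a very small sketch. This is not the Babylonian Engine, which the article describes as a pretrained AI model. It is a toy stand-in that guesses a hidden word from its two neighbours, and the English placeholder sentences, the function names and the counting approach are all assumptions made purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical placeholder "tablet lines" in English, not real Akkadian data.
TRAIN_LINES = [
    "the king gave silver to the temple",
    "the scribe gave silver to the temple",
    "the king gave barley to the palace",
]
HELD_OUT_LINES = ["the scribe gave barley to the temple"]


def pad(line):
    """Split a line into words and add start/end markers."""
    return ["<s>"] + line.split() + ["</s>"]


def train_context_model(lines):
    """Count which word was seen between each (left, right) neighbour pair."""
    table = defaultdict(Counter)
    for line in lines:
        tokens = pad(line)
        for i in range(1, len(tokens) - 1):
            table[(tokens[i - 1], tokens[i + 1])][tokens[i]] += 1
    return table


def predict_masked(table, left, right):
    """Return the most frequent word seen in this context, or None if unseen."""
    candidates = table.get((left, right))
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def masked_accuracy(table, lines):
    """Hide each word in turn and report the fraction predicted correctly."""
    hits = 0
    total = 0
    for line in lines:
        tokens = pad(line)
        for i in range(1, len(tokens) - 1):
            guess = predict_masked(table, tokens[i - 1], tokens[i + 1])
            if guess == tokens[i]:
                hits += 1
            total += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    model = train_context_model(TRAIN_LINES)
    print(f"masked-word accuracy: {masked_accuracy(model, HELD_OUT_LINES):.2f}")
```

The scoring loop is the part that mirrors the article: words whose correct reading is already known are hidden one at a time, the model guesses, and accuracy is the share of correct guesses. A real system would replace the neighbour-count table with a trained language model.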
Stanovsky told New Scientist that the new AI program “is not a replacement for human experts” but more of an “assistive tool.” The human touch will always be needed to contextualize a tablet, considering variables such as where and when it was discovered. Even so, the new system is accurate enough that, in a statement from November last year, Gordin said historians with less formal training in Akkadian can enter Akkadian text and get results which “are citable in their research and publications.”