The TUNICO Dictionary

By Ines Dallaji, Ines Gabsi, Stephan Procházka, Veronika Ritt-Benmimoun, Gisela Kitzler, Bettina Leitner, Ines Ben-Brahim

Vienna 2018

The TUNICO Dictionary was created as one deliverable of the project Lexical dynamics in the Greater Tunis area: a corpus based approach, which was supported by the Austrian Science Fund (FWF, P25706-G23). The project was conducted in close cooperation between the Institute of Oriental Studies of the University of Vienna and the Austrian Centre for Digital Humanities of the Austrian Academy of Sciences. Large parts of the project's research were situated at the crossroads of variational linguistics and language technology.

The dictionary was built not only on data from the corpus of spoken language that was compiled in the same project, but also on two additional sources: data elicited from interviews with young Tunisians and lexicographical material taken from published historical sources dating from the middle of the 20th century and earlier. The most important of these is Hans-Rudolf Singer’s monumental grammar (1984; almost 800 pages) of the Medina of Tunis. Singer’s data was systematically evaluated and integrated into the dictionary, with all of this material marked by a reference to the book. Other resources (Nicolas 1911, Marçais/Guîga 1958-61, Quéméneur 1962, Abdellatif 2010) were also consulted in order to verify and complete the contemporary data. The diachronic dimension will help to better understand processes in the development of the lexicon (for more details see Moerth, Prochazka, & Dallaji 2014).

The dictionary can serve as an index to Singer’s grammar. However, we do not claim completeness of the material for the time being.

The project was embedded in the activities of the two large-scale pan-European research infrastructure consortia in the humanities, CLARIN (Common Language Resources and Technology Infrastructure) and DARIAH (Digital Research Infrastructure for the Arts and Humanities). Both infrastructures have grown out of the ESFRI Roadmap and were officially endorsed by the Commission of the European Union after a preparatory phase of several years (Budin, Moerth, & Durco 2013).

The Arabic dialect of Tunis as spoken today by the majority of the city’s inhabitants is a contact variety influenced by a vast population influx from all over Tunisia during the 20th century. It can be regarded as a prestige variety since speakers of other Tunisian Arabic dialects tend to shift towards it. Tunis Arabic is widely used in oral and visual media (theater, film, slogans). It is, however, rarely written except in informal letters, newspaper cartoons or advertising slogans. (By Ines Dallaji)

Through the query interface, you can search for words or groups of words in the dictionary. Entering a word such as ʕaṛbi and pressing the ENTER key on your keyboard will trigger the query. Results matching your query will be displayed below the input field.

Note that all queries are case-sensitive.

The transcription used in the dictionary is for the most part DMG. If you need special characters such as ā, š or ʕ, click on the respective letters in the character table.

The preview option will show you a list of tokens that start with the characters you entered so far.

It is possible to search in particular fields of the dictionary. Wildcards are applied on the token level.

Query       Field             Result
ktb         Roots             All entries with the Arabic root ktb
book        Trans. (English)  All entries with an English sense book
diminutive  subc              All diminutives
VIII        subc              All form VIII verbs
adverb      POS               All adverbs. This will take some time!
The interface also supports a simple query language.
Query               Result
[root="ktb"]        All entries with the root ktb
[lem="gōl"]         All entries with an Arabic token gōl
[etymLang="English"]  All entries with an English etymology
You can use more than one query term.
Query                                     Result
[root="ktb"] & [pos="verb"]               All verbs with the root ktb
[root="ktb"] & [pos="noun"]               All nouns with the root ktb
[root="ktb"] & [pos="verb"] & [subc="I"]  All form I verbs with the root ktb
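Multi-term queries of this kind can also be assembled programmatically before being pasted into the input field. The following Python sketch is purely illustrative: the helper name build_query is hypothetical and not part of the TUNICO interface, and it only produces a query string in the bracket syntax shown above.

```python
def build_query(**terms):
    """Join field="value" terms with ' & ' in the bracket syntax shown above.

    build_query is a hypothetical helper, not part of the TUNICO interface.
    """
    if not terms:
        raise ValueError("at least one query term is required")
    # Keyword arguments keep their order, so terms appear as they were passed.
    return " & ".join('[{}="{}"]'.format(field, value)
                      for field, value in terms.items())


print(build_query(root="ktb", pos="verb", subc="I"))
```

The printed string can then be entered in the query field exactly like the hand-written examples.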
You can make use of simple regular expressions. Try the following examples:
Query  Field       Result
.*ūni  All fields  All entries containing the string ūni
bal.*  Arabic      All entries with an Arabic lemma token containing the string bal
kt.*   Roots       All entries with an Arabic root starting with kt
To explicitly anchor a pattern at the head (=beginning) of a token, make use of "^". "$" anchors the pattern at the tail (=end).
Query  Field  Result
^kt.*  Roots  kt at the beginning of a string, followed by zero or more characters
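The effect of the two anchors can be explored offline. The sketch below uses Python's re.search as a stand-in for a regular expression search that succeeds when the pattern is found anywhere in a token unless it is anchored. The sample roots are made up for illustration, regular expression dialects differ in details, and the interface may treat unanchored patterns differently, so this is an approximation rather than a description of the interface's behaviour.

```python
import re

# Made-up sample roots, for illustration only.
roots = ["ktb", "ktf", "skt", "bkt"]


def matching(pattern, tokens):
    """Return the tokens in which the pattern is found (unanchored search)."""
    return [t for t in tokens if re.search(pattern, t)]


print(matching("kt.*", roots))   # kt anywhere in the token
print(matching("^kt.*", roots))  # kt only at the head of the token
print(matching("kt$", roots))    # kt only at the tail of the token
```

In this sketch the unanchored pattern finds all four sample roots, "^kt.*" keeps only ktb and ktf, and "kt$" keeps only skt and bkt.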

The interface makes use of the XQuery function matches whenever it detects a regular expression. To get an overview of the options have a look at