Deniz Zeyrek Bozşahin

Professor, Cognitive Science Program, and Director of the Graduate School of Informatics
Middle East Technical University
06800, Çankaya, Ankara

I hold a PhD in linguistics and carry out interdisciplinary research by compiling electronic language resources, in particular by recording linguistic data in corpora (a process known as linguistic annotation). My research specialties are discourse and pragmatics and their role in understanding human cognition. Recently, I have concentrated on discourse mechanisms, seeking to understand the role of discourse relations in human languages. I analyze written Turkish texts, investigate explicit and implicit ways of signalling discourse relations, and try to specify the features that could be important for discourse relations.

I have been the principal developer of the Turkish Discourse Bank, or TDB (an electronic resource of Turkish annotated for discourse relations in the Penn Discourse TreeBank style), created with the generous support of a TÜBİTAK (Scientific and Technological Research Council of Turkey) project (No. 107E156), 2007-2011. My corpus development and linguistic annotation efforts go hand in hand with my research on discourse mechanisms. Linguistically annotated corpora are ultimately inputs to language technologies, and TDB is the platform where our inquiries on Turkish discourse are recorded, with the ultimate aim of serving theoretical investigations and future language technology applications.

Selected publications

The following is a selection of my recent publications (journal articles, book chapters and articles in conference proceedings) from 2010 onwards. They represent my work in the research areas mentioned on this webpage. See also my institutional profile at the Graduate School of Informatics, METU.


Zeyrek, D. TED Multilingual Discourse Bank (TED-MDB): A parallel corpus annotated in the PDTB style. 11th Linguistic Annotation Workshop (LAW), European Chapter of the Association of Computational Linguistics. April 3rd, 2017, Valencia.

Zeyrek, D. & Kurfalı, M. TDB 1.1: Extensions on Turkish Discourse Bank. Proc., 11th Linguistic Annotation Workshop (LAW), European Chapter of the Association of Computational Linguistics. 3-4 April 2017, Valencia, Spain.

Demirşahin, I. & Zeyrek, D. (2017). Pair Annotation as a Novel Annotation Procedure: The Case of Turkish Discourse Bank. In Nancy Ide & James Pustejovsky (Eds.) Handbook of Linguistic Annotation. Springer.


Kaygusuz, Y. & Zeyrek, D. (2016). Turkish children's early vocabulary: A study on the lexical diversity of two sisters. In B. Haznedar & R. N. Ketrez (Eds.) The Acquisition of Turkish in Childhood. Trends in Language Acquisition Research, 20 (pp. 57-78).

Tolgay, A. E., Zeyrek, D., Kurfalı, M., Bozşahin, C. (2016). A Turkish database for psycholinguistic studies based on frequency, age of acquisition, and imageability. LREC 2016, 10th edition of the Language Resources and Evaluation Conference, 23-28 May 2016, Portorož (Slovenia).

Kurfalı, M., Zeyrek, D., Gonçalves, T. (2016). Automatic prediction of implicit discourse relations in Turkish. Conference Handbook: Structuring Discourse in Multilingual Europe Second Action Conference. Károli Gáspár University of the Reformed Church in Hungary, Budapest, 11-14 April, 2016 (pp. 65-70).


Zeyrek, D., Sağın-Şimşek, Ç., Ataş, U., Rehbein, J. (Eds.) (2015). Ankara Papers in Turkish and Turkic Linguistics. Wiesbaden: Harrassowitz Verlag.

Zeyrek, D., Demirşahin, I., Sevdik-Çallı, A. B., Kurfalı, M. (2015). Annotating implicit discourse relations in Turkish: The challenge of corrective discourse relations. Paper presented at the workshop Discourse connectives across languages and modes: Challenges for discourse annotation, organized by Sandrine Zufferey, Liesbeth Degand & Daniel Hardt. 14th International Pragmatics Association (IPRA) Conference. Antwerp, Belgium 26-31 July, 2015. Abstracts (pp. 455-456).


Zeyrek, D. (2014). On the distribution of the contrastive-concessive discourse connectives ama 'but/yet' and fakat 'but' in written Turkish. In Suihkonen, P., & Whaley, L. J. (Eds.). On Diversity and Complexity of Languages Spoken in Europe and North and Central Asia (Vol. 164). John Benjamins Publishing Company (pp. 251-275).

Zeyrek, D. & Acartürk, C. (2014). The distinction between unaccusative and unergative verbs in Turkish: An offline and an eyetracking study of split intransitivity. CogSci 2014 Proceedings (pp. 1832-1837).

Demirşahin, I. & Zeyrek, D. (2014). Annotating discourse connectives in spoken Turkish. LAW VIII - The 8th Linguistic Annotation Workshop, Dublin, Ireland, August 23-24, 2014 (pp. 105-109).

Erten, B., Bozşahin, C., Zeyrek, D. (2014). Turkish resources for visual word recognition. LREC 2014, The 9th edition of the Language Resources and Evaluation Conference, 26-31 May, Reykjavik, Iceland (pp. 2106-2110).


Zeyrek, D., Demirşahin, I., Sevdik-Çallı, A. B., Çakıcı, R. (2013). Turkish Discourse Bank: Porting a discourse annotation style to a morphologically rich language. Dialogue & Discourse. Vol. 4, No. 2: 174-184.

Demirşahin, I., Öztürel, A., Bozşahin, C., Zeyrek, D. (2013). Applicative structures and immediate discourse in the Turkish Discourse Bank. Proc. of the 7th Linguistic Annotation Workshop & Interoperability with Discourse (pages 122-130). Aug. 8-9 2013, Sofia, Bulgaria.

Zeyrek, D. (2013). The discourse connective yerine 'instead' in Turkish. 13th International Pragmatics Conference, 8-13 September, 2013, New Delhi, India.


Zeyrek, D., Turan Ü. D., Demirşahin I., & Çakıcı R. (2012). Differential properties of three discourse connectives in Turkish: A corpus-based analysis of Fakat, Yoksa, Ayrıca. In A. Benz, M. Stede, & P. Kühnlein (eds.). Constraints in Discourse 3. Representing and inferring discourse structure. (pp. 183-206) John Benjamins Publishing Company.

Demirşahin, Işın, Yalçınkaya İ., & Zeyrek D. (2012). Pair annotation: Adaption of Pair Programming to Corpus Annotation. Proceedings. Association of Computational Linguistics, Proceedings of the Sixth Linguistic Annotation Workshop (pp. 31-39).

Göy, E., Zeyrek D., & Otcu B. (2012). Developmental Patterns in Internal Modification Use in Requests: A Quantitative Study on Turkish Learners of English. In Helen Woodfield & Maria Kogetsidis (Eds.). Interlanguage Request Modification (pp. 51-87). John Benjamins Publishing Company.

Şirin, U., Çakıcı R., & Zeyrek D. (2012). METU Turkish Discourse Bank Browser. Proceedings, The 8th edition of the Language Resources and Evaluation Conference (LREC), 21-27 May, 2012, Istanbul, Turkey.

Demirşahin, Işın, Çallı A. B. S., Balaban H. Ö., Çakıcı R., & Zeyrek D. (2012). Turkish Discourse Bank: Ongoing Developments. Proceedings, Workshop on Turkic Languages. The 8th edition of the Language Resources and Evaluation Conference (LREC), 21-27 May, 2012, Istanbul, Turkey.

Zeyrek, D. (2012). Thanking in Turkish: a Corpus-based Analysis. In Leire Luiz Zarobe and Yolanda Luiz Zarobe (Eds.). Speech Acts and Politeness across Languages and Cultures (pp. 27-52). Peter Lang.


Turan, Ü. D., & Zeyrek D. (2011). Context, contrast, and the structure of discourse in Turkish. In Anita Fetzer & Etsuko Oishi (Eds.). Context and contexts: Parts meet whole? (pp. 147-170). John Benjamins.


Zeyrek, D., Demirşahin Işın, Çallı A. B. S., Balaban H. Ö., Yalçınkaya İ., & Turan Ü. D. (2010). The annotation scheme of the Turkish Discourse Bank and an evaluation of inconsistent annotations. Proceedings of the Fourth Linguistic Annotation Workshop (pp. 282–289). 15–16 July 2010, Uppsala, Sweden.

Aktaş, B., Bozşahin C., & Zeyrek D. (2010). Discourse Relation Configurations in Turkish and an Annotation Environment. ACL 2010. LAW IV. Fourth Linguistic Annotation Workshop. Proceedings of the Workshop (pp. 202-206). 15-16 July 2010, Uppsala, Sweden.

Akgün, M., Çağıltay K., & Zeyrek D. (2010). The effect of apologetic error messages and mood states on computer users' self-appraisal of performance. Journal of Pragmatics. 42(9), 2349-2448.

Acartürk, C., & Zeyrek D. (2010). Unaccusative/Unergative Distinction in Turkish: A Connectionist Approach. The 23rd International Conference on Computational Linguistics. Proceedings of the 8th Workshop on Asian Language Resources.


I lead a research group interested in developing discourse-annotated corpora as inputs to NLP systems and as the basis of linguistic investigations. We have mainly been working on the Turkish Discourse Bank to enrich it with more annotations. TDB itself was built with a grant from TÜBİTAK, The Scientific and Technological Research Council of Turkey (2008-2011), on which I was the principal investigator. You can also access my research group, Turkish Discourse Bank, from here.

I am part of Textlink: Structuring Discourse in Multilingual Europe (COST Action IS1312), where I act as the Communication Manager and as a member of the Steering Committee.

Recently, I have been involved in developing a multilingual corpus representing several languages in the Textlink Action. This corpus is called TED-Multilingual Discourse Bank, or TED-MDB, and currently involves English, Turkish, Portuguese, German, Russian and Polish. This effort is in progress.


I usually teach Cogs 541 Language Acquisition and Cogs 528 Discourse Mechanisms.

In February 2017, I taught at Textlink Training School in Prague.


I supervise master's and PhD research projects on the topics described under Research Interests above; I welcome thesis proposals on those or on related areas.

Here are the most recent master's and doctoral theses I supervised:


Assessment of the Turkish Discourse Bank and a Cascaded Model to Automatically Identify Discursive Phrasal Expressions in Turkish, by Ayışığı B. Sevdik-Çallı (2015)


Automatic Sense Prediction of Implicit Discourse Relations in Turkish, by Murathan Kurfalı (2016) METU, Cognitive Science.

A Turkish Database for Psycholinguistic Studies: A Corpus Based Study on Frequency, Age of Acquisition and Imageability, by Elif Ahsen Tolgay (2015), METU, Cognitive Science.

Discovering the Discourse Role of Converbs in Turkish, by Ahmet Faruk Acar (2014), METU, Cognitive Science.

Scientific committees in conferences

I have served on the program committee or scientific committee of various interdisciplinary conferences and workshops, for example, Textlink - Structuring Discourse in Multilingual Europe: Second Action Conference, 2016; ISBCS 2014, 2015 (1st/2nd International Symposium on Brain and Cognitive Science); Language Resources and Evaluation Conference (LREC) 2012, 2014, 2015, 2016; Language and Resources area of Association of Computational Linguistics Conference (ACL) 2013, 2014, 2015, 2016. I also serve on the program committees of the International Conference of Turkish Linguistics (ICTL) and the National Turkish Linguistics Conferences (Ulusal Türk Dilbilim Kurultayı).

I was one of the organizers of the 16th International Conference of Turkish Linguistics (ICTL) in 2012, and of LAEL (Linguistic Approaches to Endangered Languages: Theory and Description) in 2015.

I am the Editor of Dilbilim Araştırmaları Dergisi/Journal of Linguistic Research, one of the oldest peer-reviewed journals of linguistics in Turkey publishing research articles (primarily) on Turkish.


T.E.D. Ankara Koleji (Lycee diploma)
Hacettepe University, Department of Linguistics (BA)
Kansas University, Department of Linguistics (MA)
Hacettepe University, Department of Linguistics (PhD)