Deniz Zeyrek's home page

Deniz Zeyrek Bozşahin

Professor, Cognitive Science Program, Graduate School of Informatics
Middle East Technical University
06800, Çankaya, Ankara

I hold a PhD in linguistics and carry out interdisciplinary research by compiling electronic language resources, in particular by recording linguistic data in corpora (a process called linguistic annotation). My research specialties are discourse and pragmatics and their role in understanding human cognition. Recently, I have concentrated on discourse mechanisms, trying to understand the role of discourse relations in human languages. I analyze written Turkish texts, investigate explicit and implicit ways of signalling discourse relations, and try to specify the features that could be important for discourse relations.

I have been the principal developer of the Turkish Discourse Bank, or TDB (an electronic resource of Turkish annotated for discourse relations in the Penn Discourse TreeBank style), created with the generous support of a TÜBİTAK (Scientific and Technological Research Council of Turkey) project (No. 107E156), 2007-2011. My corpus development and linguistic annotation efforts go hand in hand with my research on discourse mechanisms. Linguistically annotated corpora are ultimately inputs to language technologies, and TDB is the platform where our inquiries into Turkish discourse are recorded, with the ultimate aim of serving theoretical investigations and future language technology applications.


I lead a research group interested in developing discourse-annotated corpora as inputs to NLP systems and as the basis of linguistic investigations. We have mainly been working on the Turkish Discourse Bank to enrich it with more annotations. TDB itself was built with a grant from TÜBİTAK, the Scientific and Technological Research Council of Turkey. You can also access my research group's Turkish Discourse Bank page from here.

I was part of Textlink: Structuring Discourse in Multilingual Europe (COST Action IS1312).

I have been involved in developing a multilingual corpus representing several languages within the Textlink Action. This corpus is called the TED-Multilingual Discourse Bank, or TED-MDB, and currently covers English, Turkish, Portuguese, German, Russian, Polish and Lithuanian.


Google Scholar

Semantic Scholar

Web of Science ResearcherID: M-8082-2017


ResearchGate

Detailed List of Activities

METU Research Information System

Contact Information