Talk: Nathan Dykes (23.07.2025)
We cordially invite you to the final talk of the Oberseminar Computerlinguistik in the current summer semester, on 23.07.2025.
Speaker: Nathan Dykes (FAU, Department Digital Humanities and Social Studies)
Time: Wednesday, 23.07.2025, 16:15–17:45
Location: CIP-Pool Computerlinguistik, Bismarckstr. 12, Room 0.320
Topic: "A methodological framework for corpus-based discourse analysis"
Corpus-Assisted Discourse Studies (CADS) uses corpus linguistic approaches to explore how language reflects and constructs social meanings. Over the last two decades, CADS has become increasingly institutionalised, with a growing body of research across geographic areas, topics, and languages. Despite this popularity, CADS faces a core methodological challenge: the research process is often under-operationalised. In particular, few studies have reflected on how methodological decisions shape the linguistic patterns that are found in the corpus. This makes it difficult to compare studies, assess their transparency, and build cumulative knowledge across the field.
This contribution addresses that gap by proposing a methodological framework. TILDe (Transparent Interpretation through Linguistic Description) is built around three linguistic levels (lexis, semantics, and lexicogrammar), each offering a different entry point into the data. The framework foregrounds how methodological choices influence both what is discovered in the corpus and how findings are interpreted. It also aims to strengthen accountability by aligning analytical decisions with research goals. In addition, the framework treats researcher subjectivity as a central dimension to be acknowledged and accounted for.
Across several case studies, this contribution explores how different corpus methods, such as keywords, collocates, semantic tags, and systematic queries, can be used to identify different kinds of patterns in discourse. A central focus is the comparison of exploratory and targeted strategies at each linguistic level. The former foreground ad hoc pattern-spotting procedures, while the latter rely more on predefined annotation schemes and structures. By comparing these strategies side by side, the contribution shows that different entry points into the corpus suit different types of research goals.