RC21

Reading concordances in the 21st century (RC21)

Joint project of FAU Erlangen-Nürnberg and the University of Birmingham

Project leaders: Stephanie Evert, Michaela Mahlberg

Project members: Alexander Piperski, Nathan Dykes

Start date: 2023-02-23

End date: 2026-03-31

Acronym: RC21

Funding source: Deutsche Forschungsgemeinschaft (project no. 508235423), Arts and Humanities Research Council (project no. AH/X002047/1)

Abstract:

In today’s digital world, the amount of text communicated in electronic form is ever-increasing, and there is a growing need for approaches and methods to extract meanings from texts at scale. Corpus linguists have long been studying digitised texts and have established that much of language is characterised by recurring patterns. For example, the word ‘eye’ can appear together with words like ‘cream’ and ‘test’, or with words like ‘closed’ and ‘fixed’. In corpus linguistics, such patterns are identified with the help of concordances, i.e. displays that show many occurrences of a word, phrase or construction across a range of contexts in a compact format. However, lacking a well-established and clear-cut methodology, the art of reading concordances has not yet realised its full potential. At the same time, there has been very little innovation in the algorithms of the concordance software packages available to corpus linguists.

This project proposes an innovative approach to reading concordances in the 21st century. Through the collaboration between the University of Birmingham and FAU Erlangen-Nürnberg, we combine strengths in theoretical work in corpus linguistics with expertise in computational algorithms in order to develop a systematic methodology for reading concordances. We will develop tool-independent strategies and corresponding algorithms for the semi-automatic organisation of concordance lines, and implement them in the software FlexiConc. To develop and test our approach, we will conduct two case studies. The first will focus on body language in fiction compared to non-fiction texts. The second will focus on political argumentation in social media, formalising its findings as corpus queries that can be used for automatic argumentation mining. Both case studies include a comparative dimension between English and German. Hence, they broaden approaches to concordance reading, which have so far been very focused on the English language. Through these case studies, we will establish an approach that not only provides innovation in corpus linguistics, but also has wider implications for the analysis of textual data at scale, while still retaining a humanities perspective.

We will develop FlexiConc as open-source software, so that other researchers can use it as an off-the-shelf tool or integrate it into existing concordance tools or their own software environment. Both FlexiConc and our tool-independent approach to concordance analysis will have relevance beyond corpus linguistics, providing innovative approaches and algorithms for disciplines such as digital humanities and computational social science. We will raise awareness of the new possibilities in a variety of forms, for instance through a project blog where users of our software can share their experience, and with the help of an advisory board of leading international experts. We will run training sessions at summer schools and conferences and make educational materials available online.
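For readers unfamiliar with concordances, the idea described in the abstract can be illustrated with a minimal Python sketch: collect every occurrence of a node word with a few tokens of context on each side, then reorder the lines so that recurring patterns to the right of the node end up next to each other. This is only an illustration under simple assumptions (lower-casing, a naive regular-expression tokeniser, exact word-form matching). It is not FlexiConc and does not reflect its interface; the names `ConcLine`, `concordance`, `sort_by_right` and `format_line`, as well as the sample text, are hypothetical.

```python
import re
from typing import NamedTuple


class ConcLine(NamedTuple):
    """One concordance line: left context, node word, right context."""
    left: list[str]
    node: str
    right: list[str]


def concordance(text: str, node: str, span: int = 4) -> list[ConcLine]:
    """Find every occurrence of `node` with up to `span` tokens on each side.

    Assumption: tokens are lower-cased runs of word characters; a real
    corpus tool would use a proper tokeniser and linguistic annotation.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = node.lower()
    return [
        ConcLine(tokens[max(0, i - span):i], tok, tokens[i + 1:i + 1 + span])
        for i, tok in enumerate(tokens)
        if tok == target
    ]


def sort_by_right(lines: list[ConcLine]) -> list[ConcLine]:
    """Order lines alphabetically by the words that follow the node."""
    return sorted(lines, key=lambda line: line.right)


def format_line(line: ConcLine, width: int = 24) -> str:
    """Render a line in keyword-in-context layout with the node aligned."""
    left = " ".join(line.left)
    right = " ".join(line.right)
    return f"{left[-width:]:>{width}}  {line.node}  {right[:width]}"


# Invented sample text, used only to demonstrate the functions above.
SAMPLE = (
    "She bought an eye cream. His eye test was due. "
    "She kept her eyes closed. He booked an eye test."
)

if __name__ == "__main__":
    for conc_line in sort_by_right(concordance(SAMPLE, "eye", span=2)):
        print(format_line(conc_line))
```

Sorting by right context groups the two ‘eye test’ lines together, which is the kind of pattern a reader looks for when scanning a concordance. Note that the exact-match assumption skips the plural ‘eyes’; matching inflected forms would require lemmatisation or a richer query.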

Conference presentations:

Stephanie Evert, Natalie Finlayson, Michaela Mahlberg, and Alexander Piperski. Computer-assisted concordance reading. Corpus Linguistics 2023 Conference (03–06 July 2023, Lancaster)

Computational Corpus Linguistics
Prof. Dr. Stephanie Evert

Bismarckstraße 6
91054 Erlangen
Germany