Methodological foundations of corpus research and digital humanities

Corpus research in linguistics, the digital humanities and the social sciences relies on a wide range of statistical techniques and visualizations. A central goal of our research is to develop sound methodological foundations for corpus linguistics that address key problems, so that quantitative analyses are both reliable and meaningful.

Research activities

  • Quantitative methodology for literary stylometry (KALLIMACHOS e-Humanities Centre)

Project funding

  • KALLIMACHOS Centre for Digital Humanities: corpus-linguistic approaches and statistical methodology (phase 1), linguistic complexity in literary stylometry (phase 2)
    (10/2014 – 09/2019)
  • Efficient simulation experiments for large-scale parameter optimisation of machine learning approaches in natural language processing
    (10/2016 – 09/2017)

Key publications

  • Evert, Stefan; Proisl, Thomas; Jannidis, Fotis; Reger, Isabella; Pielström, Steffen; Schöch, Christof; Vitt, Thorsten (2017). Understanding and explaining Delta measures for authorship attribution. Digital Scholarship in the Humanities 32(suppl_2), ii4–ii16.
  • Evert, Stefan and Neumann, Stella (2017). The impact of translation direction on characteristics of translated texts. A multivariate analysis for English and German. In G. De Sutter, M.-A. Lefer, and I. Delaere (eds.), Empirical Translation Studies. New Theoretical and Methodological Traditions (TiLSM 300), pages 47–80. Mouton de Gruyter, Berlin.
    ☞  online supplement
  • Evert, Stefan; Wankerl, Sebastian; Nöth, Elmar (2017). Reliable measures of syntactic and lexical complexity: The case of Iris Murdoch. In Proceedings of the Corpus Linguistics 2017 Conference, Birmingham, UK.
  • Evert, Stefan and Arppe, Antti (2015). Some theoretical and experimental observations on naïve discriminative learning. In Proceedings of the 6th Conference on Quantitative Investigations in Theoretical Linguistics (QITL-6), Tübingen, Germany.
  • Baroni, Marco and Evert, Stefan (2007). Words and echoes: Assessing and mitigating the non-randomness problem in word frequency distribution modeling. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL 2007), pages 904–911, Prague, Czech Republic.
  • Evert, Stefan (2006). How random is a corpus? The library metaphor. Zeitschrift für Anglistik und Amerikanistik 54(2), 177–190.

2017

  • Evert, S., & Neumann, S. (2017). The impact of translation direction on characteristics of translated texts. A multivariate analysis for English and German. In De Sutter G, Lefer M, Delaere I (Eds.), Empirical Translation Studies. New Theoretical and Methodological Traditions. (pp. 47-80). Berlin: Mouton de Gruyter.
  • Evert, S., Wankerl, S., & Nöth, E. (2017). Reliable measures of syntactic and lexical complexity: The case of Iris Murdoch. Paper presentation, Birmingham, GB.

2015

  • Evert, S., & Arppe, A. (2015). Some theoretical and experimental observations on naïve discriminative learning. In Proceedings of the 6th Conference on Quantitative Investigations in Theoretical Linguistics (QITL-6). Tübingen, Germany.
  • Evert, S., Proisl, T., Jannidis, F., Pielström, S., Schöch, C., & Vitt, T. (2015). Towards a better understanding of Burrows's Delta in literary authorship attribution. In Proceedings of the Fourth Workshop on Computational Linguistics for Literature (pp. 79–88). Denver, CO.

2014

  • Diwersy, S., Evert, S., & Neumann, S. (2014). A weakly supervised multivariate approach to the study of language variation. In Szmrecsanyi B, Wälchli B (Eds.), Aggregating Dialectology, Typology, and Register Analysis. Linguistic Variation in Text and Speech. (pp. 174–204). Berlin, Boston: De Gruyter.

2007

  • Baroni, M., & Evert, S. (2007). Words and Echoes: Assessing and Mitigating the Non-Randomness Problem in Word Frequency Distribution Modeling. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (pp. 904-911). Prague, Czech Republic.

2006

  • Evert, S. (2006). How Random is a Corpus? The Library Metaphor. Zeitschrift für Anglistik und Amerikanistik, 54(2), 177-190.

Events

  • Open-source course on Statistical Inference – A Gentle Introduction for (Computational) Linguists (LinC 2018, Birmingham 2016, MaLT 2015, Zürich 2010, EMA 2008, DGfS/CL 2007, …)
  • Tutorial / course on Type-Token Distributions & Zipf’s Law (LREC 2018, Birmingham 2018, ESSLLI 2006)
Computational Corpus Linguistics
Prof. Dr. Stephanie Evert

Bismarckstraße 6
91054 Erlangen
Germany