Talk: Kengatharaiyer Sarveswaran (29 November 2023)

As part of the Oberseminar Computerlinguistik (advanced seminar in computational linguistics), a talk will take place on 29 November 2023, to which we cordially invite you.

 

Speaker:

Kengatharaiyer Sarveswaran
Herz Fellow, Zukunftskolleg, University of Konstanz
Faculty member, Department of Computer Science, University of Jaffna, Sri Lanka

 

Time:

Wednesday, 29 November 2023, 16:15-17:45

 

Location:

Bismarckstr. 12, room R.0.320 (in person) / also via Zoom (the link will be sent via the university's internal mailing lists; external participants are welcome to register via info@linguistik.uni-erlangen.de)

 

Topic:

Building a Tamil Dependency Treebank

 

Abstract:

Tamil, a Dravidian language, has a history extending back over two millennia and is recognised as one of the world’s oldest living languages. It is spoken by over 80 million people globally and holds official status in Sri Lanka, Singapore, and the Indian state of Tamil Nadu. Despite its rich history and cultural significance, Tamil remains a low-resource language computationally: it lacks adequate annotated data, benchmark datasets, and linguistic tools. Moreover, the evolution of its forms and scripts over time, together with its diglossia, complex morphosyntactic features, and free word order, further complicates machine processing. The talk will focus on creating a dependency treebank for Tamil using the Universal Dependencies framework. It will also address the challenges encountered during the collection, processing, and annotation of Tamil language data.