Software & Data
- SoMaJo – A tokenizer and sentence splitter for German and English web and social media texts.
- SoMeWeTa – A part-of-speech tagger with support for domain adaptation and external resources.
- pandas-association-measures – Statistical association measures for co-occurrence dataframes in pandas.
- cwb-ccc – A wrapper around the IMS Open Corpus Workbench (CWB) for extracting concordances and collocates.
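
To illustrate what a statistical association measure computes from co-occurrence counts, here is a minimal sketch of pointwise mutual information in plain Python. It does not use the pandas-association-measures API. The function names, parameter names, and counts are hypothetical and chosen only for illustration.

```python
import math


def expected_frequency(f1, f2, n):
    """Expected co-occurrence frequency if the two items were independent.

    f1, f2: marginal frequencies of the two items; n: corpus size.
    """
    return f1 * f2 / n


def pmi(o11, f1, f2, n):
    """Pointwise mutual information (base 2): log2(observed / expected)."""
    return math.log2(o11 / expected_frequency(f1, f2, n))


# Hypothetical counts: the node word occurs 100 times, the candidate
# collocate 200 times, they co-occur 40 times, in a corpus of 100,000 tokens.
score = pmi(o11=40, f1=100, f2=200, n=100_000)
print(f"PMI = {score:.2f}")
```

A positive score means the pair co-occurs more often than independence would predict. Collocation extraction typically ranks candidate collocates by a score of this kind.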