Stars: 3
Forks: 2
Language: None
Last Updated: Aug 28, 2022
Similar Repos
Language | Stars | Description | Updated At | |
---|---|---|---|---|
Shell | 5 | Text corpus of the Tlingit language for linguistic research. | May 21, 2023 | |
Python | 9 | Corpus collection toolset | Apr 06, 2018 | |
Python | 2 | MEETUPS Software for corpus collection | Sep 26, 2022 | |
Python | 8 | The unified corpus building environment for Language Models. | Jan 13, 2022 | |
Jupyter Notebook | 4 | Corpus reader extension for the Classical Language Toolkit | Apr 09, 2023 | |
Python | 6 | Discover archetypes in your text corpus using Watson Natural Language Understanding. | May 15, 2021 | |
None | 14 | Arabic vocalized text corpus | Nov 30, 2022 | |
Perl | 44 | Kyoto University Text Corpus | Apr 26, 2023 | |
HTML | 2 | The NENA corpus in plain-text markup | Jan 21, 2022 | |
HTML | 5 | Corpus collection of the Ethiopian News Headlines. | Apr 24, 2023 | |
JavaScript | 3 | Translated Italian-language corpus | Jul 14, 2022 | |
None | 44 | MultilingualShareGPT, the free multi-language corpus for LLM training | Apr 12, 2023 | |
Python | 6 | Indonesian corpus for Natural Language Processing | Dec 14, 2019 | |
Python | 14 | Searching in-memory corpus with Corpus Query Language (CQL) | Feb 15, 2023 | |
JavaScript | 11 | Text mining on the Royal Library newspaper corpus | May 24, 2022 | |
Python | 74 | Corpus of auto-labeled text for the cyber security domain | May 14, 2023 | |
None | 2 | The corpus and models for Burmese (Myanmar language) Sentence Tokenization | Apr 30, 2023 | |
Python | 19 | Annotated corpus + evaluation metrics for text anonymisation | Jul 22, 2022 | |
HTML | 2 | Corpus of Raw text for Classical Hindi | Nov 22, 2020 | |
Jupyter Notebook | 5 | Language Model for Historic Dutch (Delpher Corpus) | May 31, 2022 | |
None | 7 | Myanmar Sign Language Corpus for Emergency Domain | Aug 26, 2022 | |
Python | 10 | Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus | Oct 19, 2022 | |
PHP | 7 | Multilingual text corpus integrated with machine-readable dictionary (DICTionary + cORPUS). | Dec 20, 2022 | |
Jupyter Notebook | 7 | A corpus of English-language novels combining the ~250 novels of the Corpus of English Novels … | Dec 30, 2022 | |
Python | 8 | Building and Using A Seed Corpus for the Human Language Project | Apr 30, 2019 | |
None | 2 | A corpus of supersense-annotated adpositions and case markers in German natural-language text. | Jun 06, 2022 | |
HTML | 12 | Multilingual parallel corpus, and tools for preprocessing text | Jul 26, 2022 | |
JavaScript | 37 | A Serverless Text Annotation Tool for Corpus Development | May 05, 2023 | |
None | 16 | Public domain corpus of Catalan text | Feb 26, 2023 | |
Shell | 2 | Language models baseline Kaldi script for TORGO corpus | Apr 21, 2022 | |
JavaScript | 47 | Language-annotated Abstraction and Reasoning Corpus | Apr 24, 2023 | |
None | 20 | Microsoft Speech Language Translation (MSLT) Corpus | Apr 24, 2023 | |
HTML | 43 | R-package for text mining with the Corpus Workbench (CWB) as backend | May 11, 2023 | |
Java | 4 | Code for the GGPOnc corpus - A Corpus of German Medical Text with Rich Metadata … | Sep 30, 2022 | |
Python | 75 | The Definition Extraction From Text corpus and relevant formatting scripts | Jun 02, 2022 | |
Python | 197 | UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language | Jul 21, 2022 | |
None | 3 | A Hmong language corpus derived from the soc.culture.hmong Usenet group | Jan 30, 2023 | |
Python | 6 | Wikipedia text corpus for self-supervised NLP model training | Apr 26, 2022 | |
Python | 10 | A Corpus of Natural Language Instructions for Collaborative Manipulation | Apr 01, 2023 | |
None | 2 | Named Entity Recognition (NER) corpus for Burmese (Myanmar language) | Feb 13, 2023 | |
Python | 2 | Large scale web corpus of Austronesian text. | Jan 17, 2022 | |
Python | 16 | Generate poetry based on text corpus input | Aug 02, 2022 | |
Python | 2 | Pretrained Language Models on British Library Corpus | Aug 07, 2023 | |
None | 4 | The NLP-TAB corpus is a collection of 120 UTF-8 plain text synthetic clinical notes. These … | Apr 12, 2022 | |
Python | 27 | A content-based recommender system for books using the Project Gutenberg text corpus | Apr 03, 2023 | |
Python | 7 | Corpus created from the italki website for use in Native Language Identification tasks | Jul 13, 2022 | |
Go | 5 | Katya or The Liberated Corpus a text corpus that allows you to request and scrape … | Apr 20, 2023 | |
Python | 3 | Canonical Text Services export of the Cuneiform Digital Library Initiative corpus. | Jul 09, 2023 | |
Rust | 4 | Rust library for text feminization using open corpus linguistics data | Nov 23, 2021 | |
JavaScript | 3 | For presentation of audio/video/text corpus of Kofan texts. | Jan 28, 2023 |