|
C++ |
77 |
Corpus preprocessing |
Aug 03, 2022 |
|
PHP |
7 |
Multilingual text corpus integrated with machine-readable dictionary (DICTionary + cORPUS). |
Dec 20, 2022 |
|
Python |
10 |
Script for preprocessing multilingual Markdown. |
Jul 18, 2022 |
|
Python |
6 |
Text preprocessing tools for information extraction |
Jul 30, 2022 |
|
Python |
22 |
Text preprocessing tools in Python. |
Jul 25, 2022 |
|
R |
3 |
Corpus materials to accompany Text and Character Metrics for Multilingual Comparison (Lindemann & Bowern) |
Nov 30, 2022 |
|
Python |
6 |
Parallel corpus |
Aug 05, 2022 |
|
Python |
1266 |
A multilingual dialog corpus |
Apr 25, 2023 |
|
None |
3 |
A multilingual dialog corpus |
Mar 25, 2021 |
|
Python |
2 |
A multilingual dialog corpus |
Nov 28, 2017 |
|
Python |
7 |
Parallel corpus annotation and visualization |
Feb 03, 2023 |
|
None |
131 |
Korean Parallel Corpus |
Apr 04, 2023 |
|
None |
2 |
Parallel Corpus Search |
Feb 27, 2020 |
|
Python |
7 |
Preprocessing scripts for the Corpus Gesproken Nederlands |
Jan 23, 2022 |
|
Python |
250 |
CoVoST: A Large-Scale Multilingual Speech-To-Text Translation Corpus (CC0 Licensed) |
Jul 28, 2022 |
|
Jupyter Notebook |
2 |
A helper class for facilitating preprocessing of text corpus before any topic modeling algorithms |
Mar 10, 2020 |
|
Jupyter Notebook |
7 |
Multi-way parallel text corpus of 5 key Ugandan languages. |
Apr 19, 2023 |
|
Python |
260 |
ACE 2005 corpus preprocessing for Event Extraction task |
Apr 25, 2023 |
|
Makefile |
4 |
The Open Parallel Corpus |
Feb 18, 2023 |
|
None |
3 |
Thai Lao Parallel corpus |
Dec 28, 2021 |
|
Python |
12 |
ParCourE - Parallel Corpus Explorer |
May 18, 2023 |
|
Assembly |
3 |
The experiment of building parallel voice corpus with freely available parallel text corpora and Google … |
Mar 22, 2023 |
|
Python |
3 |
Text Preprocessing |
Oct 07, 2020 |
|
Python |
12 |
Multilingual text corpus designed to study multilingual and cross-lingual natural language understanding (NLU) models and … |
Jan 31, 2023 |
|
Python |
9 |
Nanyang Technological University - Multilingual Corpus (STB subcorpora) |
Mar 11, 2019 |
|
Python |
63 |
OpusFilter - Parallel corpus processing toolkit |
Aug 13, 2022 |
|
Python |
141 |
A Corpus for Multilingual Document Classification in Eight Languages. |
Jul 29, 2022 |
|
Common Lisp |
11 |
Common Lisp Package for Parallel Corpus Processing |
Sep 15, 2022 |
|
None |
11 |
A monolingual parallel corpus for sentence simplification |
Jun 11, 2022 |
|
C++ |
11 |
Parallel-preprocessor: a prototype of parallel CAE geometry preprocessing framework |
Jul 20, 2022 |
|
Python |
71 |
A large parallel corpus of English and Japanese |
May 07, 2023 |
|
None |
34 |
MaSS - Multilingual corpus of Sentence-aligned Spoken utterances |
Mar 22, 2023 |
|
MATLAB |
5 |
Preprocessing tools for Landsat data. |
Mar 05, 2022 |
|
Python |
93 |
Structural MRI PREProcessing (sMRIPrep) workflows for NIPreps (NeuroImaging PREProcessing tools) |
Apr 24, 2023 |
|
Python |
7 |
Software tools for DICOM preprocessing and NIfTI conversion |
Apr 10, 2023 |
|
Python |
9 |
ASR text preprocessing utility |
Feb 10, 2023 |
|
Python |
19 |
Text Preprocessing in Python |
Apr 22, 2021 |
|
None |
6 |
Parallel corpus mined from IndoWordnet synset gloss and examples |
Jun 22, 2022 |
|
Python |
2 |
Text preprocessing library for topic models |
May 05, 2022 |
|
Java |
2 |
twitter-corpus-tools |
Jan 28, 2023 |
|
Python |
19 |
Amharic English Machine Translation Corpus prepared through website crawling and custom preprocessing. |
Jul 17, 2022 |
|
None |
110 |
CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus |
Oct 12, 2022 |
|
Python |
2 |
Preprocessing and downloading scripts for the Santa Barbara Corpus of Spoken American English (SBCSAE). |
Feb 13, 2023 |
|
MATLAB |
70 |
Contains tools for EEG standardized preprocessing |
May 29, 2022 |
|
Jupyter Notebook |
131 |
Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus |
May 06, 2023 |
|
Jupyter Notebook |
5 |
The IIT Bombay English-Hindi Parallel Corpus |
Mar 18, 2022 |
|
Jupyter Notebook |
2 |
Parallel corpus dataset from the MNBVC project |
Mar 10, 2023 |
|
Go |
2 |
Multilingual Unicode Text Segmentation for IR |
Dec 05, 2017 |
|
HTML |
5 |
Resources and tools for Parallel Chinese |
Oct 31, 2018 |
|
Python |
118 |
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a … |
Oct 08, 2022 |