|
Jupyter Notebook |
5 |
The IIT Bombay English-Hindi Parallel Corpus |
Mar 18, 2022 |
|
Python |
71 |
A large parallel corpus of English and Japanese |
May 07, 2023 |
|
Python |
14 |
The New York Times English-Chinese parallel corpus |
May 02, 2023 |
|
Java |
4 |
Cantonese Chinese English Dictionary |
Jul 20, 2022 |
|
Python |
6 |
Parallel corpus |
Aug 05, 2022 |
|
None |
7 |
English words list extracted from Wikipedia articles |
Aug 02, 2022 |
|
Python |
25 |
An English-to-Cantonese machine translation model |
Apr 11, 2023 |
|
Jupyter Notebook |
2 |
Parallel corpus dataset from the MNBVC project |
Mar 10, 2023 |
|
Python |
10 |
Scripts for creating a Japanese-English parallel corpus and training NMT models |
May 23, 2023 |
|
None |
131 |
Korean Parallel Corpus |
Apr 04, 2023 |
|
None |
2 |
Parallel Corpus Search |
Feb 27, 2020 |
|
None |
2 |
Cross-language English-Arabic corpus derived from WikiQA |
May 29, 2023 |
|
Makefile |
4 |
The Open Parallel Corpus |
Feb 18, 2023 |
|
None |
3 |
Thai Lao Parallel corpus |
Dec 28, 2021 |
|
Python |
12 |
ParCourE - Parallel Corpus Explorer |
May 18, 2023 |
|
Perl |
2 |
Texts from Corpus of Middle English Prose and Verse |
Feb 15, 2022 |
|
None |
6 |
Parallel corpus mined from IndoWordnet synset gloss and examples |
Jun 22, 2022 |
|
None |
2 |
Small Japanese-English Subtitle Corpus |
Feb 26, 2023 |
|
Jupyter Notebook |
7 |
A corpus of English-language novels combining the ~250 novels of the Corpus of English Novels … |
Dec 30, 2022 |
|
Python |
63 |
OpusFilter - Parallel corpus processing toolkit |
Aug 13, 2022 |
|
Python |
7 |
Parallel corpus annotation and visualization |
Feb 03, 2023 |
|
Python |
2 |
English-Korean Parallel Dataset |
Apr 21, 2022 |
|
None |
3 |
Corpus of English and French collocations |
May 14, 2022 |
|
Shell |
8 |
English conversation corpus for conversational TTS. |
Apr 02, 2023 |
|
None |
35 |
Xlit-Crowd: Hindi-English Transliteration Corpus |
May 19, 2023 |
|
None |
20 |
Spoken Cantonese from Hong Kong. |
Aug 11, 2022 |
|
None |
3 |
Cherokee English Corpus material for NLP research |
Jan 12, 2022 |
|
Common Lisp |
11 |
Common Lisp Package for Parallel Corpus Processing |
Sep 15, 2022 |
|
None |
11 |
A monolingual parallel corpus for sentence simplification |
Jun 11, 2022 |
|
Python |
118 |
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a … |
Oct 08, 2022 |
|
Python |
46 |
A neural machine translation system from English (Chinese) to Chinese (English) based on 30m parallel … |
Aug 09, 2022 |
|
HTML |
77 |
Text corpus calculation in JavaScript. Supports Chinese and English. |
Apr 28, 2023 |
|
Python |
1123 |
Transformer seq2seq model, program that can build a language translator from parallel corpus |
Jun 01, 2023 |
|
HTML |
12 |
Multilingual parallel corpus, and tools for preprocessing text |
Jul 26, 2022 |
|
R |
4 |
Cantonese Enigma |
Jan 15, 2021 |
|
Python |
6 |
Combine Chinese and English subtitles, with optional Mandarin or Cantonese romanization and other features |
Sep 20, 2021 |
|
Python |
7 |
Curated corpus of parallel data derived from versions of the Bible provided by eBible.org. |
Jun 21, 2022 |
|
None |
2 |
English part of the ParTUT parallel treebank. |
Nov 03, 2021 |
|
Python |
163 |
Chinese and English corpus processing scripts in Python, C++, and Java |
Apr 23, 2023 |
|
None |
15 |
English Lemma Database - Compiled by Referencing British National Corpus |
Aug 17, 2022 |
|
None |
6 |
Full context labels for VCTK corpus extracted by Merlin & speech tools |
Jun 23, 2022 |
|
Python |
3 |
Train a jp2zh NMT model with ACG parallel corpus. |
May 13, 2022 |
|
Java |
2 |
The Cantonese playground |
May 16, 2022 |
|
None |
2 |
A YouTube speech corpus to study Asian North American English. |
Feb 22, 2023 |
|
Python |
2 |
Train a model on the English Wikipedia corpus with the gensim package |
May 13, 2023 |
|
Python |
6 |
This repository contains parallel Sanskrit and English documents. |
Nov 25, 2020 |
|
Jupyter Notebook |
7 |
Multi-way parallel text corpus of 5 key Ugandan languages. |
Apr 19, 2023 |
|
None |
2 |
Aligned Catalan-German and Catalan-English Europarl corpus. Catalan sentences translated from Spanish using Apertium RBMT. |
Nov 24, 2022 |
|
C++ |
95 |
AEC3 Extracted From WebRTC |
Aug 15, 2022 |
|
C++ |
4 |
AEC3 Extracted From WebRTC |
Jan 15, 2023 |