Stars: 473
Forks: 38
Language: Python
Last Updated: Dec 08, 2023
Similar Repos
Language | Stars | Description | Updated At |
---|---|---|---|
Python | 250 | CoVoST: A Large-Scale Multilingual Speech-To-Text Translation Corpus (CC0 Licensed) | Jul 28, 2022 | |
Python | 364 | A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation | Aug 02, 2022 | |
Python | 2 | Large scale web corpus of Austronesian text. | Jan 17, 2022 | |
None | 47 | A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages. | May 03, 2023 | |
None | 2 | A Large-scale Vietnamese News Text Classification Corpus | Dec 12, 2023 | |
Python | 231 | A Large Scale Text Summarization Dataset | Sep 15, 2022 | |
None | 50 | Large-Scale (~50M) Hotel Reviews Dataset | May 05, 2023 | |
None | 34 | A large-scale cleaned Chinese chitchat corpus and Chinese dialogpt models | Jun 19, 2022 | |
Python | 205 | VGGSound: A Large-scale Audio-Visual Dataset | Apr 26, 2023 | |
Python | 22 | A published large-scale dataset - Weibo User Depression Detection Dataset. | Aug 05, 2022 | |
Python | 4 | A large-scale dataset for visual place recognition | Jun 09, 2022 | |
None | 6 | ALTO (Aerial-view Large-scale Terrain-Oriented) dataset | Jun 09, 2022 | |
None | 8 | Large-scale query-focused multi-document Summarization dataset | Jan 15, 2022 | |
None | 231 | A large-scale dataset for face parsing (AAAI2020) | Jul 28, 2022 | |
Python | 646 | A Large-Scale Few-Shot Relation Extraction Dataset | Aug 13, 2022 | |
Python | 889 | COYO-700M: Large-scale Image-Text Pair Dataset | Apr 23, 2023 | |
None | 2 | A Large-Scale Urban Outdoor Point Cloud Dataset | May 06, 2022 | |
Jupyter Notebook | 2 | https://www.kaggle.com/crowww/a-large-scale-fish-dataset | Mar 30, 2023 | |
None | 12 | A Large-Scale Chinese Legal Case Retrieval Dataset | May 10, 2023 | |
Python | 2 | Sound augmentation using Large-scale audio dataset (Audioset) | May 03, 2023 | |
Python | 469 | [ECCV2020] A Large-Scale Face Anti-Spoofing Dataset | May 16, 2023 | |
Python | 98 | nablaDFT: Large-Scale Conformational Energy and Hamiltonian Prediction benchmark and dataset | May 01, 2023 | |
TeX | 4 | Large-scale, anonymous, randomized logging of type errors in Luau | Mar 16, 2023 | |
Python | 24 | Large scale unannotated Korean corpus for unsupervised tasks. (e.g. Language modeling) | Jun 21, 2022 | |
Python | 2 | Baby Abstract Reasoning Corpus (BabyARC) dataset engine, for generating grid-world-based abstract reasoning tasks on a … | Jun 30, 2023 | |
None | 9 | xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval | May 22, 2023 | |
Python | 10 | The World's First Large Scale Lidar Lane Detection Dataset and Benchmark | Jun 15, 2022 | |
JavaScript | 16 | Code and Dataset for Memeify: A Large-scale Meme Generation System | Apr 29, 2023 | |
Python | 13 | Hephaestus: A large scale multitask dataset towards InSAR understanding | Jun 16, 2022 | |
Python | 130 | SHIFT15M: multiobjective large-scale fashion dataset with distributional shifts | Aug 15, 2022 | |
None | 88 | S3E: A Large-scale Multimodal Dataset for Collaborative SLAM | Apr 20, 2023 | |
Python | 6 | Custom Iterable Dataset Class for Large-Scale Data Loading | Mar 16, 2023 | |
None | 6 | A large scale dataset for Image Captioning in Italian | Nov 02, 2022 | |
None | 11 | A large scale dataset for Video Captioning in Italian | May 16, 2023 | |
None | 21 | A large scale dataset for Question Answering in Italian | Sep 15, 2022 | |
Python | 27 | Large Scale Architectural Asset Dataset -- LSAA (IEEE TVCG 2020) | Feb 01, 2023 | |
Python | 25 | new large-scale dataset for vision-based drowsiness detection | Mar 11, 2024 | |
HTML | 12 | Multilingual parallel corpus, and tools for preprocessing text | Jul 26, 2022 | |
Python | 493 | Dataset and codes for ACL 2019 DocRED: A Large-Scale Document-Level Relation Extraction Dataset. | Aug 12, 2022 | |
Python | 1790 | A large-scale face dataset for face parsing, recognition, generation and editing. | Apr 25, 2023 | |
Python | 66 | The first large-scale tracking dataset by fusing RGB and Event cameras. | Apr 24, 2023 | |
Shell | 11 | Introduction to "Tencent’s Multilingual Machine Translation System for WMT22 Large-Scale African Languages". | Feb 23, 2023 | |
Shell | 2 | Code for the WMT21 paper "Back-translation for Large-Scale Multilingual Machine Translation" | Nov 10, 2022 | |
None | 14 | ALITA: A Large-scale Incremental Dataset for Long-term Autonomy | Jun 06, 2022 | |
C++ | 242 | [ECCV 2022 oral] OpenLane: Large-scale Realistic 3D Lane Dataset | Aug 18, 2022 | |
None | 285 | ModaNet: A large-scale street fashion dataset with polygon annotations | Jul 26, 2022 | |
C++ | 2 | A Large-scale Indoor-Outdoor integration dataset for robot Navigation. | May 08, 2022 | |
Python | 173 | Large-scale text-video dataset. 10 million captioned short videos. | Apr 25, 2023 | |
None | 51 | A large-scale multi-robot dataset for multi-robot SLAM | Apr 24, 2023 | |
Python | 3 | LS-HDIB: A Large Scale Handwritten Document Image Binarization Dataset | Jul 25, 2022 |