Stars: 26
Forks: 6
Language: Python
Last Updated: Jun 22, 2023
Similar Repos
Language | Stars | Description | Updated At |
---|---|---|---|
Python | 1044 | Unsupervised Language Modeling at scale for robust sentiment classification | Jun 30, 2022 | |
Python | 5 | Unsupervised Language Modeling at scale for robust sentiment classification | Aug 14, 2022 | |
Python | 160 | Tools for curating biomedical training data for large-scale language modeling | Jul 13, 2022 | |
Python | 2 | Large scale web corpus of Austronesian text. | Jan 17, 2022 | |
None | 2 | A Large-scale Vietnamese News Text Classification Corpus | Dec 12, 2023 | |
None | 320 | Awesome list of Korean Large Language Models. | Jan 17, 2024 | |
Python | 68 | Evaluation suite for large-scale language models. | Jul 07, 2022 | |
Python | 2 | Baby Abstract Reasoning Corpus (BabyARC) dataset engine, for generating grid-world-based abstract reasoning tasks on a … | Jun 30, 2023 | |
C++ | 849 | Scalable, fast, and lightweight system for large-scale topic modeling | Jun 12, 2022 | |
None | 34 | A large-scale cleaned Chinese chitchat corpus and Chinese dialogpt models | Jun 19, 2022 | |
Python | 872 | Evolutionary Scale Modeling (esm): Pretrained language models for proteins | Aug 10, 2022 | |
Python | 2 | Evolutionary Scale Modeling (esm): Pretrained language models for proteins | Jan 20, 2024 | |
Python | 250 | CoVoST: A Large-Scale Multilingual Speech-To-Text Translation Corpus (CC0 Licensed) | Jul 28, 2022 | |
Java | 16 | An object-oriented language for modeling large-scale neural systems, along with an IDE for writing and … | May 24, 2023 | |
HTML | 1159 | A large annotated semantic parsing corpus for developing natural language interfaces. | Aug 12, 2022 | |
C | 16 | [NeurIPS 2021] Official Matlab implementation of LOD: Large-Scale Unsupervised Object Discovery. | Aug 03, 2022 | |
Python | 364 | A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation | Aug 02, 2022 | |
Python | 116 | (TPAMI2022) The ImageNet-S benchmark/method for large-scale unsupervised/semi-supervised semantic segmentation. | May 22, 2023 | |
Python | 446 | GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors | Aug 31, 2022 | |
Python | 38 | :bug: Legacy dragonfly plugin for large-scale climate and urban heat island modeling. | Apr 17, 2022 | |
Python | 41 | Code for text augmentation method leveraging large-scale language models | Aug 17, 2022 | |
Python | 4 | language detection and topic modeling on multi-terabyte common crawl corpus | Apr 15, 2021 | |
None | 673 | Large-scale Pre-training Corpus for Chinese 100G 中文预训练语料 | Apr 28, 2023 | |
None | 9 | Code and workflows for large-scale genomic search of e.g. the Sequence Read Archive | Jan 11, 2023 | |
Python | 6840 | Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities | Oct 06, 2022 | |
Python | 56 | Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities | Aug 10, 2022 | |
Haskell | 110 | SmartyPants for Korean language | Apr 19, 2023 | |
Python | 7 | A visualization interface for analyzing a (very large) corpus of natural-language queries. | Aug 28, 2022 | |
Jupyter Notebook | 20 | A framework for the large scale analysis of programming language usage. | Jun 13, 2022 | |
Python | 7 | Corpus created from the italki website for use in Native Language Identification tasks | Jul 13, 2022 | |
Python | 23 | NIST SPH File reader (e.g. for TEDLIUM Corpus) | Jan 16, 2023 | |
Jupyter Notebook | 12 | Large language modeling applied to T-cell receptor (TCR) sequences. | Jul 16, 2022 | |
Python | 67 | Code for papers "A Surprisingly Robust Trick for Winograd Schema Challenge" and "WikiCREM: A Large … | Jun 16, 2022 | |
None | 54 | A large-scale complex question answering evaluation of ChatGPT and similar large-language models | Mar 21, 2023 | |
None | 3 | A large-scale complex question answering evaluation of ChatGPT and similar large-language models | May 07, 2023 | |
Python | 35 | This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models. | Jun 21, 2023 | |
Python | 5 | Generator for Large Scale Structure | Apr 12, 2022 | |
Python | 1687 | Large-scale pretraining for dialogue | Aug 07, 2022 | |
C++ | 9 | This repository provides code for SVD and Importance sampling-based algorithms for large scale topic modeling. | Apr 20, 2021 | |
None | 2 | CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society | Apr 06, 2023 | |
Python | 430 | Can large language models provide useful feedback on research papers? A large-scale empirical analysis. | Jan 18, 2024 | |
Python | 48 | A framework to empower quantitative modeling using Large Language Models (LLMs) | May 24, 2023 | |
Python | 3447 | A large-scale 7B pretraining language model developed by BaiChuan-Inc. | Jun 19, 2023 | |
Python | 13 | Code for co-training large language models (e.g. T0) with smaller ones (e.g. BERT) to boost … | May 17, 2023 | |
Python | 49 | Unsupervised Domain Adaptation for Computer Vision Tasks | Jul 29, 2022 | |
Python | 1845 | 🐫 CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society | Apr 25, 2023 | |
Python | 82 | WebNav: A New Large-Scale Task for Natural Language based Sequential Decision Making | Mar 29, 2022 | |
Python | 102 | A large-scale language model for scientific domain, trained on redpajama arXiv split | Jan 17, 2024 | |
Python | 2 | Code repo for the paper: "SubMix: Practical Private Prediction for Large-scale Language Models" | Jan 08, 2022 | |
None | 2 | HateBR is the first large-scale expert annotated corpus of Brazilian Instagram comments for hate speech … | Jun 13, 2022 |