LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Stars: 460

Forks: 33

Language: Python

Last Updated: May 11, 2024

Similar Repos

Repo	Language	Stars	Description	Updated At
unilm	None	7	UniLM - Unified Language Model Pre-training / Pre-training for NLP and Beyond	Jun 03, 2022
Flaubert	Python	224	Unsupervised Language Model Pre-training for French	May 28, 2023
TextPruner	Python	210	A PyTorch-based model pruning toolkit for pre-trained language models	Oct 13, 2022
PERT	None	180	PERT: Pre-training BERT with Permuted Language Model	Aug 13, 2022
llama.mmengine	Python	35	Training LLaMA language model with MMEngine! It supports LoRA fine-tuning!	Apr 09, 2023
Union	Python	2	Unifying Language-Image Pre-training via Single-Tower Transformer	Jan 30, 2023
aws-lex-retrieval-extraction-lm-pt	Python	3	Examples for pre-training retrieval-extraction based language model	Oct 25, 2021
MSG	Python	16	Masked Structural Growth for 2x Faster Language Model Pre-training	Jan 21, 2024
lit-llama	Python	1639	Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, quantization, LoRA fine-tuning, …	Apr 01, 2023
lit-llama	Python	2	Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, quantization, LoRA fine-tuning, …	May 31, 2023
llama-tools	Python	3	Tools for the LLaMA language model	Apr 09, 2023
XNLG	Python	110	AAAI-20 paper: Cross-Lingual Natural Language Generation via Pre-Training	Apr 24, 2022
PMR	Python	4	Pre-training Machine-Reader (Instead of Masked Language Model) at Scale	Jun 03, 2023
GLIP	Python	574	Grounded Language-Image Pre-training	Sep 01, 2022
GLIP	Python	5	Grounded Language-Image Pre-training	Mar 31, 2023
SpliceBERT	None	6	Pre-mRNA language model	Apr 19, 2023
llama-classification	Python	6	Text classification with Foundation Language Model LLaMA	Apr 06, 2023
VLMixer	None	14	VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix (ICML 2022)	Feb 21, 2023
Open-Llama	Python	44	The complete training code of the open-source high-performance Llama model, including the full process from …	Apr 09, 2023
open-Chinese-ChatLLaMA	Python	39	The complete training code of the open-source Chinese-Llama model, including the full process from pre-training …	Apr 28, 2023
Open-Llama	None	4	The complete training code of the open-source high-performance Llama model, including the full process from …	Jul 12, 2023
Open-Llama	None	12	The complete training code of the open-source high-performance Llama model, including the full process from …	Jun 19, 2023
Chinese-GPT	Jupyter Notebook	53	Chinese Transformer Generative Pre-Training Model	Apr 02, 2023
PALM	Python	32	PALM: Pre-training an Autoencoding & Autoregressive Language Model for Context-conditioned Generation	Apr 22, 2023
GraNet	Python	14	[NeurIPS 2021] Sparse Training via Boosting Pruning Plasticity with Neuroregeneration	Jul 31, 2022
UER-py	Python	2221	Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo	Sep 10, 2022
UER-py	None	2	Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo	Oct 12, 2020
mozilla_deepSpeech_rus	Shell	3	Training a Russian language model	Feb 21, 2023
VisorGPT	None	9	Learning Visual Prior via Generative Pre-Training	May 24, 2023
finetune-transformer-lm	Python	1628	Code and model for the paper "Improving Language Understanding by Generative Pre-Training"	Aug 10, 2022
finetune-transformer-lm	Python	2	Code and model for the paper "Improving Language Understanding by Generative Pre-Training"	Jan 28, 2023
GRAIN	Python	5	GRAIN: Gradient-based Intra-attention Pruning on Pre-trained Language Models	Aug 25, 2023
TencentPretrain	Python	3	Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo	Oct 11, 2022
TencentPretrain	None	3	Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo	Feb 17, 2023
only_train_once	Python	241	OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, LLM	Jan 10, 2024
LD-Net	Python	148	Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling	Apr 27, 2023
LAMP	Python	8	Code for NAACL 2022 Findings paper: Specializing Pre-trained Language Models for Better Relational Reasoning via …	Jan 22, 2023
LLM-Distributed-Quantization	Python	2	Accelerating multi-node Large Language Model training with per-layer selective quantization (FP32 -> FP16) of the …	Oct 07, 2022
awesome-vision-language-modeling	None	10	Recent Advances in Vision-Language Pre-training!	Jul 20, 2022
OmDet	None	7	Object Detection with Vision-Language Pre-training	Apr 10, 2023
data-centric.vlp	None	4	Compress conventional Vision-Language Pre-training data	Jun 01, 2023
llama-bot	Python	2	Discord bot for interacting with the LLaMA language model	Apr 02, 2023
language_model_transformer	Python	6	language model via transformer	May 05, 2022
pretrained-models	None	819	Open Language Pre-trained Model Zoo	Aug 23, 2022
Caffe_IncReg	Makefile	13	[IJCNN'19, IEEE JSTSP'19] Caffe code for our paper "Structured Pruning for Efficient ConvNets via Incremental …	Jun 10, 2021
rnn.wgan	Python	254	Code for training and evaluation of the model from "Language Generation with Recurrent Generative Adversarial …	May 04, 2023
clip-italian	Jupyter Notebook	89	CLIP (Contrastive Language–Image Pre-training) for Italian	Jun 30, 2022
ContrastivePruning	Python	22	Source code for our AAAI'22 paper 《From Dense to Sparse: Contrastive Pruning for Better Pre-trained …	Feb 08, 2023
language-model	Python	4	Large language model training and deployment	Apr 25, 2023
VarCLR	Python	30	VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning	Jul 06, 2022