Alpaca-LoRA-RLHF-PyTorch

A full pipeline to fine-tune the Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with …

Stars: 53
Forks: 6
Language: Python
Last Updated: Apr 23, 2024

Similar Repos