Stars
196
Forks
18
Language
Python
Last Updated
May 14, 2024
Similar Repos
Language | Stars | Description | Updated At |
---|---|---|---|
Python | 7 | A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation … | Apr 23, 2023 | |
Python | 15 | A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation … | Apr 24, 2023 | |
Python | 74 | Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically … | Dec 12, 2022 | |
Python | 2 | Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically … | May 21, 2023 | |
None | 2 | Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically … | Nov 30, 2023 | |
Python | 5 | Finetuning alpaca with RLHF (Reinforcement Learning with Human Feedback) | Apr 25, 2023 | |
Jupyter Notebook | 87 | Implementation of Reinforcement Learning from Human Feedback (RLHF) | Mar 29, 2023 | |
Python | 6 | Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically … | Apr 21, 2023 | |
Python | 4 | Train reward models for reinforcement learning from human feedback (RLHF). | Aug 28, 2023 | |
Python | 51 | Baichuan LLM supervised fine-tuning with LoRA | Jan 15, 2024 | |
Python | 49 | Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback | May 16, 2023 | |
Python | 2 | Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback | Oct 22, 2023 | |
Python | 3061 | A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) | Apr 25, 2023 | |
Jupyter Notebook | 14 | Finetune Stable Diffusion with Dreambooth, LoRA, and ControlNet | Jun 30, 2023 | |
Jupyter Notebook | 27 | Try original alpaca. The multi-turn version is at [multi-turn-alpaca](https://github.com/l294265421/multi-turn-alpaca) and the version further trained with … | Apr 25, 2023 | |
None | 27 | A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval. | May 09, 2023 | |
Python | 18 | Finetune the BLOOM large language model with the LoRA method | Apr 19, 2023 | |
Python | 2 | Alpaca-lora Chatbot with Infinite Memory by Finetune | Apr 23, 2023 | |
Python | 5 | Training LLaMA or MOSS with RLHF, optionally using LoRA | Jun 12, 2023 | |
None | 972 | A curated list of reinforcement learning with human feedback resources (continually updated) | Apr 24, 2023 | |
None | 774 | Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human … | Apr 25, 2023 | |
Python | 23 | LLM chatbot server with ChatGPT plugins | Apr 11, 2023 | |
Python | 285 | LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA) | Jan 19, 2024 | |
None | 26 | Curated list of resources for Reinforcement Learning from Human Feedback and Language Models | Apr 24, 2023 | |
Python | 112 | A minimum example of aligning language models with RLHF similar to ChatGPT | Apr 09, 2023 | |
Python | 52 | Enhance your ML workflows with human feedback | May 24, 2023 | |
JavaScript | 53 | 🗣️ Chat with LLM like Vicuna totally in your browser with WebGPU, safely, privately, and … | May 11, 2023 | |
Python | 5 | langchain-streamlit demo with streaming llm, memory, and langsmith feedback | Oct 16, 2023 | |
Jupyter Notebook | 30 | A Practical Guide to Developing a Reliable FAQ Chatbot with Reinforcement Learning and Human Feedback … | Mar 13, 2023 | |
Python | 36 | A Discord Bot for chatting with LLaMA, Vicuna, Alpaca, or any other LLM supported by … | Apr 09, 2023 | |
Python | 2 | Reinforcement learning for human walking motion with prosthetic leg | Sep 26, 2019 | |
Python | 6 | Cooperative Multi Agent Reinforcement Learning with Human in the Loop | Apr 24, 2023 | |
Python | 6 | A Discord Bot for chatting with LLaMA, Vicuna, Alpaca, or any other LLM supported by … | May 08, 2023 | |
Python | 7 | ChatGPT re-created with GPT-3.5 LLM as Telegram Bot. Light-weight fork. | Apr 14, 2023 | |
None | 12 | Implementations of Baseline Methods for Aligning Text2Img Diffusion Models with Human Feedback | Apr 24, 2023 | |
None | 3 | EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data" | May 27, 2022 | |
Python | 22 | (TNNLS) Prioritized Experience-Based Reinforcement Learning with Human Guidance for Autonomous Driving | Apr 24, 2023 | |
Python | 2 | (TNNLS) Prioritized Experience-Based Reinforcement Learning with Human Guidance for Autonomous Driving | Aug 05, 2023 | |
None | 8 | A primer on large language models (LLM) as of Jan 2023, with bonus ChatGPT topic | Mar 22, 2023 | |
Python | 2 | Connects nodes (tanks, cars, anything basically) together to be controlled by a centralised environment with … | Dec 21, 2021 | |
None | 14 | Collection of open source implementations of LLMs with IFT and RLHF that are striving to … | Apr 09, 2023 | |
Go | 144 | miti is a musical instrument textual interface. Basically, it's MIDI, but with human-readable text. :musical_note: | Jul 29, 2022 | |
Python | 102 | (T-ITS) Driving Behavior Modeling using Naturalistic Human Driving Data with Inverse Reinforcement Learning | May 04, 2023 | |
Python | 2 | [ICCV21, Oral] PyMAF: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop | Dec 07, 2022 | |
Python | 2 | [ICCV21, Oral] PyMAF: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop | Feb 28, 2023 | |
TypeScript | 2 | A code editor integrated with ChatGPT to provide realtime analysis and feedback. Made using T3 | Apr 23, 2023 | |
C++ | 3 | In this Lora IoT project tutorial, I have shown how to make the LoRa Arduino … | Oct 04, 2022 | |
Python | 374 | [ICCV 2021, Oral] PyMAF: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback … | Aug 17, 2022 | |
Python | 38 | Automatically generate Anki Flashcards from your PDF files with LLM (ChatGPT in this case) to … | Jun 17, 2023 | |
Python | 27 | Check AppImages for compatibility, best practices etc. Powerful functionality combined with simple usage and human-friendly … | Sep 17, 2022 |