RLHF :: ViewTube

Reinforcement Learning from Human Feedback (RLHF) Explained

Want to play with the technology yourself? Explore our interactive demo → https://ibm.biz/BdKSby Learn more about the ...

11:29

84,357 views

1 year ago

StatQuest with Josh Starmer

Reinforcement Learning with Human Feedback (RLHF), Clearly Explained!!!

Generative Large Language Models, like ChatGPT and DeepSeek, are trained on massive text-based datasets, like the entire ...

18:02

55,547 views

11 months ago

Sebastian Raschka

Reinforcement Learning with Human Feedback (RLHF) in 4 minutes

Understanding Reinforcement Learning with Human Feedback (RLHF) – A short clip from my talk at the 2023 Optimized AI ...

4:06

13,828 views

1 year ago

Nathan Lambert

RLHF and Post-training Overview | RLHF Book Course, Lecture 1

Welcome to The RLHF Book Course with Nathan Lambert. All resources will be available at https://rlhfbook.com/ Order a copy of ...

46:10

5,837 views

5 days ago

Shane | LLM Implementation

Stop Using RLHF: How to Align & Control LLMs (DPO Guide)

I asked an AI model to ignore its filters and teach me how to shoplift. The standard fine-tune complied immediately.

10:38

401 views

4 months ago

Hugging Face

Reinforcement Learning from Human Feedback: From Zero to chatGPT

In this talk, we will cover the basics of Reinforcement Learning from Human Feedback (RLHF) and how this technology is being ...

1:00:38

188,273 views

Streamed 3 years ago

Allow AI

AI popularizer New Machina introduced another crucial concept in machine learning: reinforcement learning with human ...

5:07

What Is RLHF? Simple Guide (2025)

26 views

6 months ago

Umar Jamil

Reinforcement Learning from Human Feedback explained with math derivations and the PyTorch code.

In this video, I will explain Reinforcement Learning from Human Feedback (RLHF) which is used to align, among others, models ...

2:15:13

69,304 views

2 years ago

Mark Hennings

Learn how Reinforcement Learning from Human Feedback (RLHF) actually works and why Direct Preference Optimization (DPO) ...

19:39

RLHF Explained (and DPO!)

17,563 views

1 year ago

Graphics in 5 Minutes

Reinforcement Learning: ChatGPT and RLHF

Reinforcement Learning from human feedback, and how it's used to help train large language models like ChatGPT. Part 3 of RL ...

6:31

24,544 views

2 years ago

Zachary Huang

Don't like the Sound Effect?: https://youtu.be/6xEXyJAbYns LLM Training Playlist: ...

1:30:36

RLHF in 90 min

5,052 views

6 months ago

CodeEmporium

Reinforcement Learning through Human Feedback - EXPLAINED! | RLHF

We talk about reinforcement learning through human feedback. ChatGPT among other applications makes use of this. ABOUT ME ...

10:17

29,393 views

2 years ago

AI Thought

6:34

W2 9 How LLMs follow instructions, Instruction tuning and RLHF

8,735 views

2 years ago

Julia Turc

Proximal Policy Optimization (PPO) for LLMs Explained Intuitively

In this video, I break down Proximal Policy Optimization (PPO) from first principles, without assuming prior knowledge of ...

22:03

52,256 views

1 year ago

AI Academy

RLHF Explained: The "Secret Sauce" That Makes ChatGPT & Claude Actually Useful

Have you ever wondered why ChatGPT, Claude, and other advanced AI models feel so much more "human" and helpful than the ...

12:44

140 views

2 months ago

Luis Serrano Academy

Reinforcement Learning with Human Feedback (RLHF) - How to train and fine-tune Transformer Models

Reinforcement Learning with Human Feedback (RLHF) is a method used for training Large Language Models (LLMs). In the heart ...

15:31

34,557 views

2 years ago

DeepLearningAI

Mastering RLHF with AWS: A Hands-on Workshop on Reinforcement Learning from Human Feedback

Enroll in Generative AI with LLMs here: https://bit.ly/3rVrDB6 Join us for a hands-on workshop where you will learn about ...

1:01:01

24,967 views

Streamed 2 years ago

AI Papers Academy

Generative Reward Models: Merging the Power of RLHF and RLAIF for Smarter AI

In this video we dive into Generative Reward Models, introduced in a fascinating recent AI research paper by Stanford University.

7:51

2,214 views

1 year ago