Want to play with the technology yourself? Explore our interactive demo → https://ibm.biz/BdKSby Learn more about the ...
84,357 views
1 year ago
Generative Large Language Models, like ChatGPT and DeepSeek, are trained on massive text-based datasets, like the entire ...
55,547 views
11 months ago
Understanding Reinforcement Learning with Human Feedback (RLHF) – A short clip from my talk at the 2023 Optimized AI ...
13,828 views
Welcome to The RLHF Book Course with Nathan Lambert. All resources will be available at https://rlhfbook.com/ Order a copy of ...
5,837 views
5 days ago
I asked an AI model to ignore its filters and teach me how to shoplift. The standard fine-tune complied immediately.
401 views
4 months ago
In this talk, we will cover the basics of Reinforcement Learning from Human Feedback (RLHF) and how this technology is being ...
188,273 views
Streamed 3 years ago
AI popularizer New Machina introduced another crucial concept in machine learning: reinforcement learning with human ...
26 views
6 months ago
In this video, I will explain Reinforcement Learning from Human Feedback (RLHF) which is used to align, among others, models ...
69,304 views
2 years ago
Learn how Reinforcement Learning from Human Feedback (RLHF) actually works and why Direct Preference Optimization (DPO) ...
17,563 views
Reinforcement Learning from Human Feedback, and how it's used to help train large language models like ChatGPT. Part 3 of RL ...
24,544 views
Don't like the Sound Effect?: https://youtu.be/6xEXyJAbYns LLM Training Playlist: ...
5,052 views
We talk about reinforcement learning through human feedback. ChatGPT among other applications makes use of this. ABOUT ME ...
29,393 views
8,735 views
In this video, I break down Proximal Policy Optimization (PPO) from first principles, without assuming prior knowledge of ...
52,256 views
Have you ever wondered why ChatGPT, Claude, and other advanced AI models feel so much more "human" and helpful than the ...
140 views
2 months ago
Reinforcement Learning with Human Feedback (RLHF) is a method used for training Large Language Models (LLMs). In the heart ...
34,557 views
Enroll in Generative AI with LLMs here: https://bit.ly/3rVrDB6 Join us for a hands-on workshop where you will learn about ...
24,967 views
Streamed 2 years ago
In this video we dive into Generative Reward Models, introduced in a fascinating recent AI research paper by Stanford University.
2,214 views