3,575 results
Run massive AI models on your laptop! Learn the secrets of LLM quantization and how q2, q4, and q8 settings in Ollama can save ...
392,564 views
1 year ago
In this video, we discuss the fundamentals of model quantization, the technique that allows us to run inference on massive LLMs ...
43,916 views
9 months ago
In this video we define the basics of quantization and look at its benefits and how it affects large language models.
28,075 views
2 years ago
Quantizing models for maximum efficiency gains! Resources: Model Quantized: ...
22,285 views
Welcome back to the Ollama course! In this lesson, we dive into the fascinating world of AI model quantization. Using variations of ...
29,063 views
This video explores DeepSeek R1, how distilled versions and quantization make it more accessible, and the trade-offs between ...
22,906 views
Are 1-bit LLMs the future of efficient AI? Or just a catchy Microsoft metaphor? In this video, we break down BitNet, the so-called ...
87,119 views
8 months ago
I Made ChatGPT-2 Run on a Potato (63MB AI Model!) - Extreme Quantization Experiment What happens when you compress a ...
469,247 views
5 months ago
Large Language Models (LLMs) are measured by the number of parameters they contain – the number of weights and biases ...
43,743 views
Can you really train a large language model in just 4 bits? In this video, we explore the cutting edge of model compression: fully ...
49,586 views
In this video I will introduce and explain quantization: we will first start with a little introduction on numerical representation of ...
50,839 views
The first comprehensive explainer for the GGUF quantization ecosystem. GGUF quantization is currently the most popular tool for ...
48,585 views
7 months ago
QLoRA is the first approach that allows the TRAINING of Large Language Models (LLMs) on a single GPU. It does this by using ...
22,926 views
What is LLM Quantization? Large Language Models (LLMs) are built using ...
2,983 views
11 months ago
Run AI Models Locally: Quantization Explained (Q2, Q3, Q4, Q5) Want to run large language models (LLMs) like Phi-4 on your PC ...
4,610 views
A NEW benchmark and guide to which quantization models to use locally on your PC or laptop. Either in Ollama or in LM Studio, ...
3,916 views
6 months ago
Try Voice Writer - speak your thoughts and let AI handle the grammar: https://voicewriter.io Four techniques to optimize the speed ...
60,293 views
Download Tanka today https://www.tanka.ai and enjoy 3 months of free Premium! You can also get $20 / team for each referral ...
364,229 views
Quantization is an excellent technique to compress Large Language Models (LLM) and accelerate their inference. In this video ...
22,845 views
Text: https://github.com/The-Pocket/PocketFlow-Tutorial-Video-Generator/blob/main/docs/llm/quantization.md 0:00:00 ...
2,244 views
2 months ago