ViewTube


40 results

Andrej Baranovskij
How to Cache vLLM Model in FastAPI for Faster Inference

I show you how to keep your vLLM model loaded in FastAPI cache for much faster inference — without reloading it on every ...

7:47

220 views

4 days ago

Runtime Fables
Self-Hosting a 30B AI Model 🤯 (No API, No Limits) | Sarvam-30B + vLLM

In this video, we walk through how to self-host the Sarvam-30B model using vLLM, one of the fastest and most efficient inference ...

6:01

26 views

2 days ago

ManuAGI - AutoGPT Tutorials
Trending Open-Source Github Projects: MoneyPrinterV2, vllm-omni, Unsloth, OpenGauss & RCLI #242

AI Agents Studio: https://www.youtube.com/channel/UCAawqobkJZ28OLcYcMgqYaw?sub_confirmation=1 This video covers the ...

14:16

3,925 views

7 days ago