ViewTube

62 results

Red Hat Community
Getting Started with Inference Using vLLM (20:18)
Steve Watt, PyTorch ambassador - Getting Started with Inference Using vLLM. (See the code sketch below.)
632 views · 3 months ago
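
To set expectations for what "getting started" looks like in code, here is a minimal sketch of vLLM's offline inference API; it is not taken from the talk, and the model name is only a placeholder for whatever model you actually serve.

```python
# Minimal vLLM offline-inference sketch (placeholder model, assumes a local vLLM install).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # placeholder; any supported model works
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

for out in llm.generate(["What does vLLM do?"], params):
    print(out.outputs[0].text)
```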

Red Hat Community
vLLM Semantic Router: Intelligent Auto Reasoning for Efficient LLM Inference on Mixture-of-Models (32:57)
Huamin Chen, vLLM Semantic Router project creator - vLLM Semantic Router: Intelligent Auto Reasoning Router for Efficient LLM ...
132 views · 3 months ago

Julien Simon
Deep Dive: Optimizing LLM inference (36:12)
Open-source LLMs are great for conversational applications, but they can be difficult to scale in production and deliver latency ...
44,069 views · 1 year ago

Vuk Rosić
Nano-vLLM - DeepSeek Engineer's Side Project - Code Explained (19:18)
repo - https://github.com/GeeeekExplorer/nano-vllm/tree/main * Nano-vLLM is a simple, fast LLM server in ~1200 lines of Python ...
1,529 views · 7 months ago

CNCF [Cloud Native Computing Foundation]
Efficient LLM Deployment: A Unified Approach with Ray, VLLM, and Kubernetes - Lily (Xiaoxuan) Liu (27:08)
Don't miss out! Join us at our next Flagship Conference: KubeCon + CloudNativeCon Europe in London from April 1 - 4, 2025.
3,859 views · 1 year ago

Red Hat Community
Combining Kubernetes and vLLM to Deliver Scalable, Distributed Inference with llm-d (28:26)
Greg Pereira, llm-d maintainer - Combining Kubernetes and vLLM to Deliver Scalable, Distributed Inference with llm-d.
486 views · 3 months ago

The Machine Learning Engineer
LLMOps: vLLM Integration with Langchain #machinelearning #datascience (52:38)
In this video we give an introduction to vLLM technology and its integration with the Langchain library to create RAG ... (See the code sketch below.)
14,545 views · 10 months ago
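
The LangChain integration mentioned above can be sketched roughly as follows, assuming the langchain-community and vllm packages are installed; the model name is a placeholder and the full RAG pipeline from the video is not reproduced here.

```python
# Rough sketch of wrapping vLLM as a LangChain LLM (placeholder model).
from langchain_community.llms import VLLM

llm = VLLM(
    model="facebook/opt-125m",  # placeholder, not the model used in the video
    max_new_tokens=128,
    temperature=0.7,
)

# The returned object plugs into LangChain chains; here we just call it directly.
print(llm.invoke("Explain retrieval-augmented generation in one sentence."))
```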

The Linux Foundation
Scalable and Efficient LLM Serving With the VLLM Production Stack - Junchen Jiang & Yue Zhu (39:36)
Don't miss out! Join us at the next Open Source Summit in Hyderabad, India (August 5); Amsterdam, Netherlands (August 25-29); ...
317 views · 6 months ago

The Linux Foundation
Streamlining AI Pipelines With Elyra: From Development To Inference With KServe & VLLM - Ritesh Shah (26:00)
Don't miss out! Join us at the next Open Source Summit in Seoul, South Korea (November 4-5). Join us at the premier ...
60 views · 4 months ago

CNCF [Cloud Native Computing Foundation]
Yes You Can Run LLMs on Kubernetes - Abdel Sghiouar & Mofi Rahman, Google Cloud (27:25)
Don't miss out! Join us at our next Flagship Conference: KubeCon + CloudNativeCon events in Hong Kong, China (June 10-11); ...
825 views · 9 months ago

The Machine Learning Engineer
LLMOps: vLLM Inference LLM Server Engine #machinelearning #datascience (45:45)
In this video I introduce vLLM, an LLM inference and serving library. Notebooks: ... (See the code sketch below.)
228 views · 1 year ago
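
Besides the offline API, vLLM can also run as a standalone server. Below is a hedged sketch of querying such a server through its OpenAI-compatible endpoint, assuming one was started separately (for example with `vllm serve <model>` on the default port) and that the model name matches whatever the server loaded.

```python
# Query a running vLLM server via its OpenAI-compatible API.
# Base URL, API key, and model name are assumptions about a locally started server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # vLLM ignores the key

resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # must match the model the server was launched with
    messages=[{"role": "user", "content": "Summarize what vLLM does."}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```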

CNCF [Cloud Native Computing Foundation]
LLMs on Kubernetes: Squeeze 5x GPU Efficiency With Cache, Route, Repea... Yuhan Liu & Suraj Deshmukh (32:31)
Don't miss out! Join us at our next Flagship Conference: KubeCon + CloudNativeCon events in Amsterdam, The Netherlands ...
474 views · 2 months ago

Red Hat Community
Triton for vLLM (37:46)
Burkhard Ringlein, Chih-Chieh Yang, Sara Kokkila Schumacher, IBM and Rishi Astra, University of Texas - Triton for ...
425 views · 8 months ago

DevConf
Auto-tuning vLLM - DevConf.US 2025 (9:51)
Speaker(s): Rehan Samaratunga. My auto-tuning project aims to find the best settings for running large language models using ...
80 views · 3 months ago

Dutch Algotrading
Run Your Locally Hosted Deepseek, Qwen or Codellama AI Assistant in VSCode Under 5 Minutes! (5:26)
Run your locally hosted AI coding assistant in VSCode with the Continue extension, Ollama, Deepseek, Qwen or CodeLlama in less ... (See the code sketch below.)
70,826 views · 11 months ago
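
Editor extensions such as Continue talk to a locally hosted model over HTTP. As a rough illustration of that request path, here is a sketch that sends a prompt to Ollama's OpenAI-compatible endpoint on its default port; the model name is a placeholder for whatever model was pulled locally.

```python
# Send a coding prompt to a locally hosted model, as an editor extension would.
# Endpoint and model name are assumptions about a default local Ollama setup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # key is unused locally

resp = client.chat.completions.create(
    model="qwen2.5-coder",  # placeholder; use whichever model `ollama pull` fetched
    messages=[{"role": "user", "content": "Write a Python function that reverses a string."}],
)
print(resp.choices[0].message.content)
```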

CNCF [Cloud Native Computing Foundation]
Tutorial: Cloud Native Sustainable LLM Inference in Action (1:24:20)
Don't miss out! Join us at our next Flagship Conference: KubeCon + CloudNativeCon North America in Salt Lake City from ...
487 views · 1 year ago

CNCF [Cloud Native Computing Foundation]
Lightning Talk: Best Practices for LLM Serving with DRA - Chen Wang & Abhishek Malvankar, IBM (9:37)
Don't miss out! Join us at our next Flagship Conference: KubeCon + CloudNativeCon North America in Salt Lake City from ...
475 views · 1 year ago

CNCF [Cloud Native Computing Foundation]
Sailing Multi-host Inference for LLM on Kubernetes - Kay Yan, DaoCloud (28:03)
Don't miss out! Join us at our next Flagship Conference: KubeCon + CloudNativeCon India in Hyderabad (August 6-7), and ...
306 views · 6 months ago

Python Code Camp
Chat with PDF langchain project (0:25)
shorts #short #shortvideo #python #pythonprogramming #pythonshorts #pythontips #pythontricks #chatgpt #langchain ...
52,612 views · 1 year ago

The Machine Learning Engineer
LLMOps : vLLM Integracion con Langchain (Español) #machinelearning #datascience (56:03)
In this video we are going to introduce vLLM technology and its integration with the Langchain library to create ...
5,579 views · 10 months ago