spark optimization techniques databricks

A Deep Dive into Spark SQL's Catalyst Optimizer (Cheng Lian + Maryann Xue, DataBricks)

CMU Database Group - Quarantine Tech Talks (2020) Speaker: Cheng Lian + Maryann Xue (DataBricks) A Deep Dive into Spark ...

1:08:50

6,083 views

5 years ago

GKCodeLabs

Reading SparkSession from main arguments | Big Data Day to Day Work | Work as a team

In the last video I discussed the program where you can develop a big data project and see big data use cases that include ...

23:16

1,323 views

5 years ago

Surfalytics | Fast Track to Data Career

Spark UI and Query plan review project | Surfalytics

This video provides a practical guide to using Apache Spark, focusing on running Spark locally with Docker and Databricks ...

1:00:34

214 views

1 year ago

FOSDEM

Maggy: Asynchronous distributed hyperparameter optimization based on Apache Spark Asynchronous algorithms on a ...

25:02

630 views

5 years ago

CMU Database Group

20 - Databricks Photon / Spark SQL (CMU Advanced Databases / Spring 2023)

Prof. Andy Pavlo (https://www.cs.cmu.edu/~pavlo/) Slides: https://15721.courses.cs.cmu.edu/spring2023/slides/20-databricks.pdf ...

1:07:58

15,149 views

2 years ago

GKCodeLabs

Issues in Big Data Projects | Interview Question | 10 Issues Answered

In this video I explain 10 big data issues that I have faced while working on big data projects. To become a ...

21:18

19,686 views

5 years ago

Surfalytics | Fast Track to Data Career

Real Interview Q&A for Senior Data Engineer #1 | Surfalytics

Gain a unique perspective on the technical and problem-solving skills expected of senior data engineers in this very lightly edited ...

30:26

19,299 views

1 year ago

BigDatapediaAI

Apache Spark Performance Tuning | Resource Allocation with 4 Node Cluster | Hands on Demo

Apache Spark Performance Tuning | Resource Allocation with 4 Node Cluster | Hands on Example Discussion Forum ...

54:37

1,409 views

4 years ago

FOSDEM

Validating Big Data Jobs An exploration with Spark & Airflow (+ friends)

by Holden Karau At: FOSDEM 2019 https://video.fosdem.org/2019/UA2.118/validating_big_data_jobs.webm If you, like close to ...

23:33

497 views

6 years ago

Plain Schwarz

Prashanth Babu – Simplifying upserts and deletes on Delta Lake tables

Data Engineers face many challenges with Data Lakes. GDPR requests, data quality issues, handling large metadata, merges ...

31:14

183 views

4 years ago

EuroPython Conference

Anna Veronika Dorogush - CatBoost - the new generation of Gradient Boosting

CatBoost - the new generation of Gradient Boosting [EuroPython 2018 - Talk - 2018-07-26 - PyCharm [PyData]] [Edinburgh, UK] ...

42:04

14,452 views

7 years ago

The ASF

Storage-Partitioned Join for Apache Spark - Chao Sun, Ryan Blue

Thank you to all of our ApacheCon@Home 2021 sponsors, including: STRATEGIC --------------- Google PLATINUM ...

39:00

3,293 views

4 years ago

ScalaIO FR

Chetan Khatri - TransmogrifAI - Automate ML Workflow with power of Scala and Spark at massive scale.

In spite of huge progress in Artificial Intelligence and Machine Learning over the past decade, building production ready ...

47:54

664 views

7 years ago

The ASF

Faster Bigdata Analytics By Maneuvering Apache Carbondata’s Indexes

Data in the 21st Century is like Oil in the 18th Century: an immensely valuable, untapped asset if processed in an intelligent way.

40:24

252 views

4 years ago

The Linux Foundation

Hyperparameter Tuning Using Kubeflow - Richard Liu, Google & Johnu George, Cisco Systems

In machine learning, ...

35:06

591 views

6 years ago

The ASF

Accelerating distributed joins in Apache Hive: Runtime filtering enhancements

Panagiotis Garefalakis, Stamatis Zampetakis A ...

40:01

245 views

5 years ago

DevConf

Streaming Functions as a Service with Apache Spark and Kubernetes

Speaker: Michael McCune Although there are several popular frameworks that have arisen in the past few years to address the ...

32:17

162 views

6 years ago

Plain Schwarz

Berlin Buzzwords 2017: Marcin Szymaniuk - Apache Spark? If only it worked #bbuzz

Do you have plans to start working with Apache Spark? Are you already working with Spark but you haven't gotten the expected ...

37:58

760 views

8 years ago

CMU Database Group

Query Optimization and Acceleration at Dremio (Steven Phillips + Vivekanand Vellanki)

CMU Database Group - Vaccination Database Tech Talks - Second Dose (2021) Speakers: Steven Phillips + Vivekanand Vellanki ...

1:03:52

3,181 views

4 years ago

BelPy

Automatic Feature Engineering on Large Scale Time Series Data using tsfresh & Dask Arnab Biswas

... the feature extraction process to a cluster. tsfresh handles larger-than-memory data utilizing Dask and PySpark; however, I am ...

37:33

1,892 views

4 years ago
