Session Title: The World of Spark in Azure
Speaker(s): Warner Chaves
Abstract: Apache Spark is an open-source unified analytics engine for large-scale data processing that has taken the data industry by storm since its inception in 2014.
With a clean, integrated interface for programming entire clusters, Spark offers implicit data parallelism and fault tolerance, making it a great open-source or proprietary platform for open-ended big data processing. With a very broad range of capabilities and programmability, Spark can also run in many different ways, and choosing the right one is key to any successful project.
In this session we will introduce Spark’s main concepts and the main ways to run it in Microsoft Azure: HDInsight, Databricks and Synapse!
500+ sessions are now available on-demand from Data Platform Summit 2022, 2021 & 2020 at no cost.