Session Title: AI And Analytics With Apache Spark And Azure Databricks
Speaker: Andrew Brust

Abstract: Open source technology Apache Spark is the analytics and machine learning platform of choice for many companies. While Spark has manifested in numerous parts of the Microsoft stack, including HDInsight, Synapse Analytics and even SQL Server 2019, Microsoft’s go-to Spark service is Azure Databricks.

The service, from Microsoft and Databricks (the company founded by Spark’s creators), is a versatile one, geared towards data lake management, analytics, data engineering and data science. Azure Databricks lets developers work in notebooks offline, run them interactively against running clusters, or schedule them as production jobs that provision Spark clusters on demand.

This session will cover the concepts, service mechanics, and code necessary for you to do analytics and machine learning on Azure Databricks, and integrate it with other Microsoft cloud services and on-premises technologies.

You will learn:

About the fundamentals of Apache Spark, Spark SQL and Spark MLlib
How to use Databricks notebooks and manage clusters
The rigors of integrating Databricks with Azure Storage, Azure SQL Database and Power BI
How to write Python code for both analytics and machine learning
Cool new Databricks features, like Delta Lake, Delta Engine and MLflow

300+ sessions are now available on-demand from Data Platform Summit 2021 & 2020 at no cost.