Beginner’s Bliss: Effortless Export of Delta Tables to Azure Data Lake Storage (ADLS) with PySpark

Dhruv Singhal
1 min read · Jul 17, 2023


To export a Delta table to ADLS (Azure Data Lake Storage) using PySpark in Databricks, you first read the table with spark.read.format("delta").load(...), and then write it back out with DataFrame.write.format("delta").save(...), pointing the save path at your ADLS destination.

Here’s a step-by-step guide on how to export a Delta table to ADLS using PySpark and Databricks:

  1. First, make sure you have the necessary credentials and access permissions to read the Delta table and write to ADLS.
  2. Import the necessary libraries:
from pyspark.sql import SparkSession

3. Create a Spark session (in a Databricks notebook, a session named spark already exists, so this step is optional):

spark = SparkSession.builder \
    .appName("Export Delta Table to ADLS") \
    .getOrCreate()

4. Read the Delta table:

delta_table = spark.read.format("delta").load("path/to/delta_table")

Replace “path/to/delta_table” with the path to your Delta table.
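
If the table is registered in the metastore rather than referenced by path, you can also read it by name. The database and table names below are just placeholders:

# Alternative: read a registered Delta table by name (placeholder name)
delta_table = spark.read.table("my_database.my_delta_table")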

5. Define the destination path in ADLS where you want to save the data:

adls_path = "adl://<YOUR_ACCOUNT_NAME>.azuredatalakestore.net/path/to/destination"

Replace <YOUR_ACCOUNT_NAME> with your ADLS account name, and specify the desired destination path.
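
Note that the adl:// scheme shown above applies to ADLS Gen1. If your storage account is ADLS Gen2, the path uses the abfss:// scheme instead; the container and account names below are placeholders:

# ADLS Gen2 path format (placeholder container and account names)
adls_path = "abfss://<CONTAINER_NAME>@<YOUR_ACCOUNT_NAME>.dfs.core.windows.net/path/to/destination"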

6. Save the Delta table to ADLS:

delta_table.write.format("delta").save(adls_path)
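
By default, save fails if the destination already contains data. If you want to overwrite or append instead, you can pass a save mode, and optionally partition the output; both are standard DataFrameWriter options ("event_date" below is just an example column name):

# Overwrite any existing data at the destination, partitioned by an example column
delta_table.write.format("delta") \
    .mode("overwrite") \
    .partitionBy("event_date") \
    .save(adls_path)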

You may also need to provide credentials and other configurations depending on your ADLS setup. If so, you can set those with the spark.conf.set method before writing. For example, if you’re authenticating to ADLS Gen1 with a service principal (client credential) flow:

spark.conf.set("spark.databricks.adls.oauth2.clientId", "YOUR_CLIENT_ID")
spark.conf.set("spark.databricks.adls.oauth2.clientSecret", "YOUR_CLIENT_SECRET")
spark.conf.set("spark.databricks.adls.oauth2.accessToken", "YOUR_ACCESS_TOKEN")

Remember to replace “YOUR_CLIENT_ID”, “YOUR_CLIENT_SECRET”, and “YOUR_TENANT_ID” with your actual values, ideally read from a secret scope rather than hard-coded in the notebook.
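
For ADLS Gen2 (the abfss:// path shown earlier), the equivalent service principal setup uses the fs.azure.account.* configuration keys; the account, client, and tenant values below are placeholders:

# Service principal (OAuth) configuration for ADLS Gen2 (placeholder values)
account = "<YOUR_ACCOUNT_NAME>"
spark.conf.set(f"fs.azure.account.auth.type.{account}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{account}.dfs.core.windows.net",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{account}.dfs.core.windows.net", "YOUR_CLIENT_ID")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{account}.dfs.core.windows.net", "YOUR_CLIENT_SECRET")
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{account}.dfs.core.windows.net",
               "https://login.microsoftonline.com/YOUR_TENANT_ID/oauth2/token")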

That’s it! The Delta table should now be exported to the specified ADLS path.

Liked it? Follow for more updates, and share your thoughts and questions in the comments. Happy data engineering!
