How to Install Python Packages in Azure Synapse for Apache Spark Pools

Published: (January 6, 2026 at 04:58 PM EST)
Source: Dev.to

Efficiently Installing Python Packages in Azure Synapse Analytics

When working in Azure Synapse notebooks, you can use the %pip command (e.g., %pip install pandas) in a code cell to install packages. However, this method is temporary: the package is installed only for the current notebook session and must be re‑installed every time a new session starts. That repeated installation adds time to every run and is inefficient for jobs that run frequently.

A more permanent and efficient solution is to install packages directly onto the Apache Spark pool. This approach ensures the libraries are pre‑installed and automatically available in every session attached to that pool.

How to Install Packages at the Spark Pool Level

This method involves uploading a requirements.txt file that specifies the packages and versions you need.

  1. Open your Azure Synapse workspace in Synapse Studio (you can launch it from the workspace page in the Azure portal).
  2. Navigate to the Manage hub on the left‑hand side.
  3. Select Apache Spark pools under the Analytics pools section.
  4. Choose the Spark pool where you want to install the package.
  5. Click the three dots on the right side of the Spark pool and select Packages.
  6. Upload the requirements.txt file that contains the list of packages you want to install.
  7. Click Apply to save the changes.

[Image: Spark pool package installation UI]

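The Spark pool will update and automatically install the specified packages. This may take a few minutes. Once the update completes, every notebook session started on this pool has access to these libraries by default. Sessions that were already running typically need to be restarted before they pick up the new packages.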
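To confirm that the pool picked up the packages, you can compare the versions pinned in requirements.txt with what a fresh notebook session actually reports. The following is a minimal sketch using only the Python standard library; the helper names parse_pins and check_pins and the package versions shown are made up for illustration, and it assumes every line of interest uses the name==version form.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(requirements_text):
    """Return {package: version} for lines of the form 'name==version'."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0]   # drop comments
        line = line.split(";", 1)[0]  # drop environment markers
        line = line.strip()
        if "==" not in line:
            continue  # skip blank lines and unpinned entries
        name, _, pinned = line.partition("==")
        pins[name.strip()] = pinned.strip()
    return pins


def check_pins(pins, get_version=version):
    """Compare pinned versions with what is installed in this session.

    Returns {package: (expected, found)} for every mismatch; found is
    None when the package is not installed at all.
    """
    problems = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    # Hypothetical pins; paste the contents of your own requirements.txt.
    example = "pandas==2.1.4\nrequests==2.31.0\n"
    print(check_pins(parse_pins(example)))
```

An empty dictionary means every pinned package is present at the expected version. Anything else lists the expected version next to the one found in the session.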

How to Generate a requirements.txt File

The requirements.txt file is a simple text file that lists the packages to be installed. You can easily generate this file from your local Python environment.

pip freeze > requirements.txt

This command captures every package in your current environment, together with its exact version, and saves the list in name==version form to a file named requirements.txt. Uploading this file asks Synapse to install those same versions, which keeps your local and Synapse environments consistent and reduces the risk of version mismatches.
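Because pip freeze lists everything in the environment, the file can include packages your notebooks never import. One optional variation, sketched here under the assumption that you already know which packages you need, is to pin only those. The function name pinned_requirements and the package list are hypothetical; the version lookup uses the standard library's importlib.metadata.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages, get_version=version):
    """Build 'name==version' lines for the given packages only."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={get_version(name)}\n")
        except PackageNotFoundError:
            # Not installed locally, so there is no version to pin.
            print(f"skipping {name}: not installed in this environment")
    return "".join(lines)


if __name__ == "__main__":
    # Hypothetical package list; replace with what your notebooks import.
    wanted = ["pandas", "requests"]
    print(pinned_requirements(wanted), end="")
```

Saving the printed lines as requirements.txt gives a shorter file to upload. The trade-off is that transitive dependencies are left unpinned, so the full pip freeze output remains the stricter option.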
