Set up the PySpark interactive environment for Visual Studio Code

The following steps show how to set up the PySpark interactive environment in VS Code. These steps apply only to non-Windows users.

The python and pip commands are used to build a virtual environment in your Home path. To use a different version, manually change the default version that the python and pip commands resolve to. For more details, see update-alternatives.

  1. Install Python and pip.

    • Install Python from https://www.python.org/downloads/.

    • Install pip from https://pip.pypa.io/en/stable/installing (if it's not installed from the Python installation).

    • Optionally, validate that Python and pip are installed successfully by running the commands python --version and pip --version, respectively.

      Note

      We recommend installing Python manually instead of using the macOS default version.

  2. Install virtualenv by running the command below.

    pip install virtualenv
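
The checks in steps 1 and 2 can be combined into one small script. This is a minimal sketch, not part of the official setup: check_tool is a hypothetical helper name, and the script assumes a POSIX-compatible shell. It reports where each command resolves on your PATH, which also shows which Python and pip versions are the current defaults.

```shell
#!/bin/sh
# check_tool is a hypothetical helper, not part of VS Code or the extension.
# It reports whether a command is on PATH, where it resolves, and its version.
# Returns 0 when the command is found, 1 otherwise.
check_tool() {
    tool="$1"
    if tool_path=$(command -v "$tool" 2>/dev/null); then
        printf '%s found at %s\n' "$tool" "$tool_path"
        # Some older tools print the version to stderr, so merge the streams.
        # A tool without a --version flag must not abort the script.
        "$tool" --version 2>&1 || true
        return 0
    fi
    printf '%s not found on PATH\n' "$tool"
    return 1
}

# Check the three commands that the steps above install.
missing=0
for tool in python pip virtualenv; do
    check_tool "$tool" || missing=$((missing + 1))
done
printf 'missing tools: %d\n' "$missing"
```

If the script reports a missing tool, or a version other than the one you expect, repeat the relevant install step or change the default python/pip version before continuing.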
    

Other packages

On Linux, if you come across the error message below, then install the required packages by running the following two commands.

Install libkrb5 package for python.

sudo apt-get install libkrb5-dev
sudo apt-get install python-dev

Restart VS Code, then go back to the VS Code editor and run the Spark: PySpark Interactive command.

Next steps

Demo

  • HDInsight for VS Code: Video

Tools and extensions