How to change the Apache Spark port and add SSL Certificate

Getting started with Apache Spark? Fantastic! In this simplified guide, we’ll walk you through making the Spark web console, which listens on port 8080 by default, reachable on the standard web ports 80 and 443 through an Nginx reverse proxy, and securing it with an SSL certificate.

Prerequisites

Make sure you have a clean installation of Apache Spark on your Ubuntu 23 server and a domain name whose DNS record points to that server. If Spark is not installed yet, follow our Apache Spark on Ubuntu 23 installation guide.

Step 1: Acquiring an SSL Certificate

Let’s start by obtaining a free SSL certificate from Let’s Encrypt using Certbot:

sudo apt update
sudo apt install certbot
sudo certbot certonly --standalone -d spark.sysnestor.com

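Replace spark.sysnestor.com with your domain. Follow the prompts to generate the SSL certificate. The standalone mode starts a temporary web server on port 80, so that port must be free while Certbot runs.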
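Before moving on, it can help to confirm that Certbot actually wrote the certificate and the key. The sketch below assumes Certbot's default layout, where both files live in /etc/letsencrypt/live/ followed by your domain; the function name cert_files_present is hypothetical and not part of Certbot.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if both files Certbot is expected to
# create (fullchain.pem and privkey.pem) exist and are non-empty in DIR.
cert_files_present() {
    local dir="$1" name
    for name in fullchain.pem privkey.pem; do
        if [ ! -s "$dir/$name" ]; then
            echo "missing or empty: $dir/$name" >&2
            return 1
        fi
    done
    echo "certificate files found in $dir"
}
```

The directory is normally readable by root only, so call it from a root shell, for example: cert_files_present /etc/letsencrypt/live/spark.sysnestor.com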

Step 2: Configuring Apache Spark for SSL

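Open your Spark configuration file, spark-defaults.conf, in Spark’s conf directory (create it from spark-defaults.conf.template if it does not exist yet):

sudo nano /opt/spark/conf/spark-defaults.conf

Add the following lines:

spark.ui.reverseProxy       true
spark.ui.reverseProxyUrl    https://spark.sysnestor.com

Nginx terminates the SSL connection in Step 3, so Spark only needs to know that it runs behind a proxy and under which public URL. The certificate files are referenced in the Nginx configuration, not here.

Don’t forget to replace spark.sysnestor.com with your domain.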
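If you would rather script this step than edit the file by hand, the following sketch sets a property without creating duplicate entries. It assumes the file uses whitespace-separated "key value" lines, as in the example above; the function name set_spark_prop is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: set KEY to VALUE in a spark-defaults.conf style file,
# replacing an existing entry for KEY or appending a new one.
# Assumption: entries are written as "key value" separated by whitespace.
set_spark_prop() {
    local file="$1" key="$2" value="$3" tmp
    touch "$file"
    tmp=$(mktemp)
    # Keep every line whose first field is not KEY, then append the new entry.
    awk -v k="$key" '$1 != k' "$file" > "$tmp"
    printf '%s %s\n' "$key" "$value" >> "$tmp"
    # Write back through redirection so the file keeps its owner and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Call it once per property with the path of your spark-defaults.conf, for example with spark.ui.reverseProxy and true as the last two arguments. Running it again with a new value updates the entry instead of adding a second one.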

Step 3: Configure Nginx as Reverse Proxy

Install Nginx:

sudo apt install nginx

Create a new Nginx server block configuration:

sudo nano /etc/nginx/sites-available/spark

Add the following configuration:

server {
    listen 80;
    server_name spark.sysnestor.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name spark.sysnestor.com;

    ssl_certificate /etc/letsencrypt/live/spark.sysnestor.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/spark.sysnestor.com/privkey.pem;

    location / {
        proxy_pass http://localhost:8080;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}

Make sure to replace spark.sysnestor.com with your own domain. The proxy_pass line points to port 8080, the default port of the Spark master web console; adjust it if your master uses a different port.

Enable the Nginx configuration:

sudo ln -s /etc/nginx/sites-available/spark /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl restart nginx
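Once Nginx is running, you may want to check from the command line that plain HTTP requests are redirected to HTTPS, as the first server block intends. This is a minimal sketch, assuming curl is installed; both function names are hypothetical.

```shell
#!/bin/bash
# Pure check: succeed when STATUS is 301 and LOCATION starts with https://
is_https_redirect() {
    local status="$1" location="$2"
    [ "$status" = "301" ] || return 1
    case "$location" in
        https://*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: ask curl for the status code and redirect target
# of an http:// URL, then apply the check above.
check_http_redirect() {
    local url="$1" out
    out=$(curl -s -o /dev/null -w '%{http_code} %{redirect_url}' "$url") || return 1
    is_https_redirect "${out%% *}" "${out#* }"
}
```

For example, check_http_redirect http://spark.sysnestor.com should succeed once the configuration above is active. Splitting the pure check from the curl call keeps the logic easy to verify without network access.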

Step 4: Create a systemd Service

Create a systemd service file for Apache Spark, for example /etc/systemd/system/spark.service, with the following content:

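[Unit]
Description=Apache Spark
After=network.target

[Service]
# start-master.sh launches the master in the background and then exits,
# so systemd has to track the forked process instead of the script itself.
Type=forking
ExecStart=/opt/spark/sbin/start-master.sh
ExecStop=/opt/spark/sbin/stop-master.sh
Restart=always
RestartSec=3

[Install]
WantedBy=default.target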

Reload systemd so that it picks up the new unit file, then enable and start the service:

sudo systemctl daemon-reload
sudo systemctl enable spark
sudo systemctl start spark

This will start the Spark master as a systemd service.
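To confirm that the service actually came up, you can poll its state for a few seconds. The sketch below relies only on systemctl is-active; the function names and the default of 10 attempts are assumptions made for illustration.

```shell
#!/bin/bash
# Succeed when the given systemd unit reports "active".
unit_is_active() {
    [ "$(systemctl is-active "$1" 2>/dev/null)" = "active" ]
}

# Hypothetical helper: poll once per second, up to TRIES times,
# until UNIT is active. Returns non-zero if it never becomes active.
wait_for_unit() {
    local unit="$1" tries="${2:-10}" i=0
    while [ "$i" -lt "$tries" ]; do
        unit_is_active "$unit" && return 0
        sleep 1
        i=$((i + 1))
    done
    return 1
}
```

A call such as wait_for_unit spark 15 returns as soon as the unit is active. If it fails, sudo journalctl -u spark shows the service log.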

Create a script, for example, monitor_spark.sh, that checks if Spark is running and restarts it if needed:

#!/bin/bash

SPARK_PID=$(pgrep -f "org.apache.spark.deploy.master.Master")

if [ -z "$SPARK_PID" ]; then
    echo "Spark is not running. Restarting..."
    /opt/spark/sbin/start-master.sh
else
    echo "Spark is running (PID: $SPARK_PID)."
fi

Make the script executable and run it:

chmod +x monitor_spark.sh
./monitor_spark.sh

If the Spark master is not running, the output looks like this:

Spark is not running. Restarting...
starting org.apache.spark.deploy.master.Master, logging to /opt/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-sysnestor.out
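A check like this is most useful when it runs periodically. One common option is cron; the sketch below only builds the crontab line, so you can inspect it before installing it. The function name, the script location and the log path are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print a crontab line that runs SCRIPT every MINUTES
# minutes and appends its output (stdout and stderr) to LOG.
monitor_cron_line() {
    local minutes="$1" script="$2" log="$3"
    printf '*/%s * * * * %s >> %s 2>&1\n' "$minutes" "$script" "$log"
}
```

For example, monitor_cron_line 5 /usr/local/bin/monitor_spark.sh /var/log/monitor_spark.log prints a line that runs the script every five minutes. Use an absolute path to the script, since cron does not start in your working directory, and add the line with crontab -e.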

Step 5: Verify SSL Configuration

Visit https://spark.sysnestor.com in your web browser. Ensure the connection is secure, and the Apache Spark web console is accessible.

Congratulations! You’ve successfully added SSL to your Apache Spark deployment and set up a reverse proxy with Nginx. Your Spark web console is now securely accessible via https://spark.sysnestor.com.

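Remember to renew your SSL certificate regularly with certbot renew, since Let’s Encrypt certificates are short-lived. Because the certificate was issued in standalone mode, Certbot needs port 80 during renewal, and Nginx now occupies that port, so Nginx has to be stopped while the renewal runs.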
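The sketch below shows one way to keep an eye on renewal. The first function uses openssl to test whether a certificate stays valid for a given number of days. The second rehearses a renewal with Certbot's --dry-run option and uses its --pre-hook and --post-hook options to stop and start Nginx, on the assumption that the certificate was issued in standalone mode and therefore needs port 80. Both function names are hypothetical.

```shell
#!/bin/bash
# Succeed when the certificate in CERT stays valid for at least DAYS more days.
# openssl's -checkend exits non-zero if the certificate expires within the
# given number of seconds (or if the file cannot be read).
cert_valid_for_days() {
    local cert="$1" days="$2"
    openssl x509 -checkend "$((days * 86400))" -noout -in "$cert" >/dev/null 2>&1
}

# Hypothetical wrapper: rehearse a renewal without replacing any certificate.
# The hooks stop Nginx so Certbot's standalone server can bind port 80,
# then start Nginx again afterwards.
renew_dry_run() {
    sudo certbot renew --dry-run \
        --pre-hook "systemctl stop nginx" \
        --post-hook "systemctl start nginx"
}
```

From a root shell, cert_valid_for_days /etc/letsencrypt/live/spark.sysnestor.com/fullchain.pem 30 fails once fewer than 30 days remain. If renew_dry_run completes without errors, the same hooks can be used for the real renewal.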