Engineering Data Solutions

👋 Welcome to my page!

I am Jonathan, a Data Engineer based in Munich, Germany. Here I share little snippets of things that I learn, find interesting, or worth discussing.

I will mostly post about topics related to the practice of data engineering. These might center on Microsoft Azure and open source technologies such as Airflow, DuckDB, or dlt. I also plan to publish posts on data governance, management, and strategy, as I firmly believe that data (engineering) projects are most successful when they are truly part of an organization’s broader business strategy.

At some point you’ll find some of my personal projects here too (once those are ready to share 😉) - from a recipe management web server written in Go to a serverless ETL platform built on GitHub Actions to a Kubernetes-powered data platform for my personal finances.

You can find my socials and recent posts below and check the “About” page in case you’d like to know more about me!

Home-Plumbing my personal data platform

This past weekend I reached the milestone I have had my eyes on since May 2025! I finally deployed Airflow 3 (a workflow orchestration tool) in my own Kubernetes cluster running on K3s (lightweight Kubernetes) and Flux (GitOps automation). The Airflow deployment runs my custom Airflow image, and my DAGs orchestrate the code from my own home-plumbing Python package. This deployment marks the “go live” of my personal data platform, which I jokingly named home plumbing - a nod toward the messy plumbing involved in the day-to-day work I face as a data engineer. ...

Manually push to GitHub Container registry with Docker login

Today I wanted to manually push a Docker image I had built locally to my GitHub Container Registry for debugging purposes. I built the image like so: docker build --pull --tag ghcr.io/jonathanschwarzhaupt/lab-airflow:v0.0.1 . It is important not to forget the trailing dot, which sets the build context to the current directory, where the Dockerfile resides. I had issues pushing the image to ghcr.io, however. I confirmed that I was logged in as the correct user through ...

First impressions with Claude Code CLI

I’ve been curious about AI-powered development tools for a while now, and recently decided to give Anthropic’s Claude Code CLI a proper test drive. After a few hours of experimenting with it on my lab repository, I wanted to share some initial thoughts on what it’s good at and where it might fit into my workflow. What I tested: I threw a couple of realistic tasks at Claude to see how it handles common development scenarios: ...

First steps with Harlequin

Today I used Harlequin for the first time. It is a SQL IDE for the terminal, and the setup was surprisingly easy. I tried out the tool as part of my “elt-on-github-actions” repository (more information on that will follow). I wanted to test the conditional logic in my marts models: if it is the production environment, output to blob storage, and if it is not, create a view in the DuckDB database. ...
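The conditional logic mentioned above can be illustrated with a small sketch. This is not the code from the repository; it only shows one way such a branch might look. The function name render_marts_statement, the environment label "prod", and the az://marts prefix are all hypothetical placeholders. The generated statements use DuckDB's COPY ... TO and CREATE OR REPLACE VIEW syntax.

```python
def render_marts_statement(
    model_name: str,
    select_sql: str,
    environment: str,
    blob_prefix: str = "az://marts",  # hypothetical blob storage location
) -> str:
    """Return the SQL statement a marts model would run in a given environment."""
    if environment == "prod":  # assumed label for the production environment
        # Production: export the query result as a Parquet file to blob storage.
        target = f"{blob_prefix}/{model_name}.parquet"
        return f"COPY ({select_sql}) TO '{target}' (FORMAT PARQUET)"
    # Any other environment: expose the query as a view inside the DuckDB database.
    return f"CREATE OR REPLACE VIEW {model_name} AS {select_sql}"


if __name__ == "__main__":
    query = "SELECT 1 AS id"
    print(render_marts_statement("orders", query, "prod"))
    print(render_marts_statement("orders", query, "dev"))
```

Printing the rendered statement for both environments and then running it in a SQL IDE is a quick way to check that each branch produces what you expect.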

Exploring Project Nessie - a transactional catalogue over Iceberg tables

This week, I dove into Project Nessie - an open-source transactional data catalogue for Apache Iceberg tables. I’d heard about Nessie’s git-like semantics and was curious about its potential for better managing data versioning and auditability in my projects. Docker Compose setup for Nessie Server and CLI: To experiment locally, I used Docker, following a guide provided by the Nessie team. Based on their materials, I put together a straightforward Docker Compose file that places both the Nessie server and the CLI in the same Docker network. This setup greatly simplifies communication between the containers. ...