Containerising Data Pipeline Components

The last post in this series covered some simple Python code that used tweepy, a Python library for the Twitter API, to obtain tweets based on a query, sentiment score each tweet and then load these into an […]
In my last post I outlined a number of architectural options for solutions that could be implemented in light of Microsoft retiring SQL Server 2019 Big Data Clusters, one of which was data pipelines that […]
kubectl is the de facto command-line tool for administering Kubernetes clusters. Connecting to a cluster via kubectl requires a Kubernetes config file, which in turn contains one or more contexts. A context is simply a […]
TL;DR This post presents some high-level architectural ideas for implementing Data Lakes using SQL Server 2022, specifically SQL Server 2022 S3 data virtualisation. Whilst SQL Server 2022 is under NDA, this post and subsequent posts […]
Someone I know worked at an organization that needed to scale out its OpenShift footprint. They were constrained by the speed of their procurement department and were wondering if they could get by with […]
Part seven of this series focuses on deploying an Azure Arc enabled Data Services controller to a Kubernetes cluster. As per the closing comments of the last blog post, PX Backup will be covered in […]
Part six of this series will focus on deploying a storage solution to our Kubernetes cluster: Where Were We? If you have been following this blog post series you should have: a basic grasp […]
Our journey up the stack brings us to the installation of MetalLB – a software load balancer for Kubernetes: All the content in this series of blog posts relates to the Arc-PX-VMware-Faststart repo on GitHub, […]
In the last post, part 3 of this series, we started off at the bottom of the stack with the Terraform module for virtual machine creation. We continue our journey up the stack in […]
Part 3 of this series will begin the journey up the stack, starting with the deployment of the virtual machines that will host the Kubernetes cluster nodes: All the blog posts in this series relate […]