Docker is a set of PaaS products that use OS-level virtualisation to deliver software in packages called containers. These containers are isolated from each other and bundle their own application, tools, libraries and configuration files. They can communicate with each other through well-defined channels.
The containers are lightweight because they don't carry the overhead of a hypervisor; instead they run directly on the host machine's kernel. This means that you can run more containers on the same hardware than you could virtual machines.
Kubernetes, also known as K8s, is an open-source system for automating deployment, scaling, and management of containerised applications.
It groups the containers that make up an application into logical units for easy management and discovery. Kubernetes builds upon 15 years of experience running production workloads at Google, combined with best-of-breed ideas and practices from the community.
Apache Kafka is an open-source stream-processing platform, originally created at LinkedIn and now developed under the Apache Software Foundation, written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.
dbt (data build tool) is a command-line tool that enables data analysts and engineers to transform data in their warehouses simply by writing SELECT statements.
dbt performs the T (Transform) in ELT; it does not handle Extract or Load operations. It allows companies to write transformations as queries and orchestrate them more efficiently.
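As a rough illustration of "a transformation is just a SELECT statement", the sketch below stores the result of a query as a new table. It is not dbt itself and does not use dbt's API: it uses Python's built-in sqlite3 module as a stand-in warehouse, and the table raw_orders, the model name daily_revenue, and the helper materialize are all hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical "model": the transformation is written as a plain SELECT.
DAILY_REVENUE_MODEL = """
    SELECT order_date, SUM(amount) AS revenue
    FROM raw_orders
    GROUP BY order_date
"""


def materialize(conn, model_name, select_sql):
    """Store the result of a SELECT as a table (illustrative helper, not a dbt function)."""
    conn.execute(f"DROP TABLE IF EXISTS {model_name}")
    conn.execute(f"CREATE TABLE {model_name} AS {select_sql}")


# An in-memory SQLite database stands in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [("2024-01-01", 10.0), ("2024-01-01", 5.0), ("2024-01-02", 7.5)],
)

# Build the model, then read it back like any other table.
materialize(conn, "daily_revenue", DAILY_REVENUE_MODEL)
rows = conn.execute(
    "SELECT order_date, revenue FROM daily_revenue ORDER BY order_date"
).fetchall()
print(rows)
```

The raw data is assumed to be loaded already, which matches the division of labour described above: extraction and loading happen elsewhere, and only the transformation is expressed here.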
PostgreSQL is a powerful, open-source object-relational database system that uses and extends the SQL language, combined with many features that safely store and scale the most complicated data workloads.
Apache Airflow is an open-source workflow management platform.
Airflow is written in Python, and workflows are created via Python scripts.
Airflow is designed under the principle of "configuration as code". While other "configuration as code" workflow platforms exist using markup languages like XML, using Python allows developers to import libraries and classes to help them create their workflows.
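The idea of defining a workflow in ordinary Python can be sketched without Airflow at all. The example below is a conceptual toy, not Airflow's API: it uses the standard-library graphlib module (Python 3.9+) to order tasks by their dependencies, and the names extract, transform, load, WORKFLOW, and run are assumptions made up for the illustration.

```python
from graphlib import TopologicalSorter


# Each task is a plain Python function that can read upstream results.
def extract(results):
    return [3, 1, 2]


def transform(results):
    return sorted(results["extract"])


def load(results):
    return f"loaded {len(results['transform'])} rows"


# The workflow is data in code: task name -> (callable, upstream task names).
WORKFLOW = {
    "extract": (extract, set()),
    "transform": (transform, {"extract"}),
    "load": (load, {"transform"}),
}


def run(workflow):
    """Run every task after all of its upstream tasks have finished."""
    graph = {name: deps for name, (_, deps) in workflow.items()}
    order = list(TopologicalSorter(graph).static_order())
    results = {}
    for name in order:
        task = workflow[name][0]
        results[name] = task(results)
    return order, results


order, results = run(WORKFLOW)
print(order)
print(results["load"])
```

Because the definition is Python, tasks can import any library, be generated in loops, or share helper functions, which is the advantage over markup-based definitions that the paragraph above describes. A real scheduler adds retries, scheduling, and distributed execution on top of this basic dependency ordering.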
The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualisations and narrative text.
Uses include data cleaning and transformation, numerical simulation, statistical modelling, data visualisation, machine learning, and much more.
Kong is a cloud-native, fast, scalable, and distributed Microservice Abstraction Layer (also known as an API Gateway or API Middleware). Made available as an open-source project in 2015, its core values are high performance and extensibility.