This section covers everything you need to know to run Kamu CLI.

    Project Status

    Documentation: Our code and documentation are actively evolving, so many topics (those in lighter gray color) have not been covered yet. While some formal documentation is missing, we are focusing on providing good quality examples, tutorials, and reference documentation, so you can learn a lot from those.

    Project Status Disclaimer: kamu is at the MVP stage of maturity, but has not reached a stable release yet. Before v1.0 we reserve the right to break compatibility between releases.

    Self-Serve Demo

    This demo guides you through the basics of using kamu and its key concepts. It lets you try out most of the tool’s features without having to install it. The demo is also available online at https://demo.kamu.dev (for financial reasons the capacity of this environment is limited).

    Requirements: To run this demo you’ll only need docker and docker-compose.

    Running: First you will need to download the docker-compose.
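
Before starting the demo it can help to confirm that both required tools are installed. The following is a minimal sketch, not part of kamu itself: the helper name `find_missing_tools` is hypothetical, and it only assumes that the two executables are called `docker` and `docker-compose` as listed in the requirements. If your Docker installation provides Compose only as the `docker compose` plugin, this check will report `docker-compose` as missing.

```python
import shutil

# Tool names taken from the demo requirements above.
REQUIRED_TOOLS = ("docker", "docker-compose")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests;
    by default it is the standard library's shutil.which.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing required tools: " + ", ".join(missing))
    else:
        print("All demo requirements were found on PATH.")
```

The script only checks that the executables exist on `PATH`; it does not verify their versions or that the Docker daemon is running.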

    Installation

    Covers the installation steps to get kamu-cli running on your computer.

    First Steps

    For a quick overview of key functionality you can also view this tutorial: This tutorial will give you a high-level tour of kamu and show you how it works through examples. We assume that you have already followed the installation steps and have the kamu tool ready. Not ready to install just yet? Try kamu in this self-serve demo without needing to install anything. Don’t forget to set up shell completions - they make kamu a lot more fun to use!

    FAQ

    How does kamu compare to Spark / Flink / Kafka Streams (or other enterprise data processing tech)?

    kamu does not compete with enterprise data processing technologies - it builds on top of them:

    - It specifies how data should be stored (e.g. making sure that data is never modified or deleted)
    - Provides stable references to data for reproducibility
    - Specifies how data & metadata are shared
    - Tracks every processing step executed, so that a person on the other side of the world who downloaded your dataset could understand exactly where every single piece of data came from
    - Allows you to evolve your processing steps over time without breaking other people’s pipelines that consume your data
    - And much more…

    So to kamu, Spark and Flink are just building blocks, while kamu is a higher-level and opinionated tool.
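
To make the first, second, and fourth points above more concrete, here is a purely conceptual sketch of an append-only store in which data is never modified or deleted, every state has a stable reference, and each block records the step that produced it. This is not kamu's storage format or API: the `AppendOnlyLog` class, its methods, and the choice of SHA-256 hashes over JSON as references are all assumptions made for illustration.

```python
import hashlib
import json


class AppendOnlyLog:
    """Toy append-only dataset: blocks can be added but never changed."""

    def __init__(self):
        self._blocks = {}  # block hash -> block contents
        self._head = None  # hash of the most recent block

    def append(self, records, step):
        """Add a block of records and return its hash as a stable reference."""
        block = {"prev": self._head, "step": step, "records": list(records)}
        payload = json.dumps(block, sort_keys=True).encode("utf-8")
        block_hash = hashlib.sha256(payload).hexdigest()
        self._blocks[block_hash] = block
        self._head = block_hash
        return block_hash

    def _chain(self, ref):
        """Return the blocks reachable from `ref`, oldest first."""
        chain = []
        while ref is not None:
            block = self._blocks[ref]
            chain.append(block)
            ref = block["prev"]
        return list(reversed(chain))

    def read(self, ref):
        """Return every record visible at reference `ref`, oldest first."""
        return [r for block in self._chain(ref) for r in block["records"]]

    def lineage(self, ref):
        """Return the processing steps that led to `ref`, oldest first."""
        return [block["step"] for block in self._chain(ref)]


if __name__ == "__main__":
    log = AppendOnlyLog()
    v1 = log.append([{"city": "A", "pop": 10}], step="ingest 2020")
    v2 = log.append([{"city": "B", "pop": 20}], step="ingest 2021")
    print(len(log.read(v1)), len(log.read(v2)))
    print(log.lineage(v2))
```

Because a reference is derived from the block's contents and its predecessor, reading through an older reference such as `v1` keeps returning the same records after later appends, which is the property that reproducibility relies on in this sketch.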