While kamu focuses primarily on the problem of data management, you will often want to do some basic data exploration before exporting data for further use in your data science projects. kamu therefore provides a few simple exploration tools that let you assess the state of your data without leaving the comfort of one tool.
Tail Command
Use the kamu tail command to quickly view a sample of the most recent events in a dataset:
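A minimal sketch of such a call is shown below. The dataset name my.dataset is a hypothetical placeholder, and the availability check is only there so the snippet fails gracefully when kamu is not installed.

```shell
# Hypothetical dataset name; replace it with a dataset from your own workspace.
DATASET="my.dataset"

if command -v kamu >/dev/null 2>&1; then
  # Print a sample of the latest events in the dataset.
  kamu tail "$DATASET"
else
  echo "kamu not found on PATH" >&2
fi
```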
Inspect Command Group
The kamu inspect * commands let you explore the metadata and lineage of datasets. For example, to display the lineage of a certain dataset in a browser, use:
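As an illustration, the lineage view can be requested roughly as follows. The dataset name is hypothetical, and the --browse flag is assumed from the CLI help; confirm it with kamu inspect lineage --help for your version.

```shell
# Hypothetical dataset name; replace it with a dataset from your own workspace.
DATASET="my.dataset"

if command -v kamu >/dev/null 2>&1; then
  # Assumption: --browse renders the lineage graph and opens it in a browser.
  kamu inspect lineage "$DATASET" --browse
else
  echo "kamu not found on PATH" >&2
fi
```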
SQL Console
The kamu sql command group provides a simple way to run ad-hoc queries and explore data using SQL.
The following command will drop you into the SQL shell:
Use describe to inspect the dataset’s schema:
The extra quotes are needed to treat a dataset name containing dots as a table name.
Press Ctrl+D to exit the SQL shell.
SQL is a widely supported language, so kamu can be used in conjunction with many other tools that support it, such as Tableau and Power BI. See integrations for details.
kamu sql is a very powerful command that you can use both interactively and for scripting. We encourage you to explore more of its options through kamu sql --help.
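For scripting, a query can be passed on the command line instead of typed into the shell. The sketch below assumes the --command option listed in kamu sql --help and uses a hypothetical dataset name; note how the dotted name is wrapped in double quotes inside the query so that it is treated as a table name.

```shell
# Hypothetical dataset name; replace it with a dataset from your own workspace.
DATASET="my.dataset"

# Dataset names containing dots must be double-quoted inside the SQL text.
QUERY="select * from \"${DATASET}\" limit 5"

if command -v kamu >/dev/null 2>&1; then
  # Assumption: --command runs a single statement and exits.
  kamu sql --command "$QUERY"
else
  echo "kamu not found on PATH" >&2
fi
```

Keeping the query in a variable makes it easy to reuse the same statement across several datasets in a loop.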
Jupyter Notebooks
Kamu also connects the power of Apache Spark with the Jupyter Notebook server. You can get started by running:
Load the kamu extension:
Create a kamu connection:
Query data using the %%sql cell magic:
By default, the notebook command will use the DataFusion engine. To start with Spark use:
To save the result of a %%sql cell to a variable use:
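Launching the notebook server, as described in this section, might look like the sketch below. The --engine option is an assumption based on the engine selection mentioned above; check kamu notebook --help for the exact spelling in your version.

```shell
# Engine to use for the notebook session. Leave empty to use the default engine.
ENGINE="spark"

if command -v kamu >/dev/null 2>&1; then
  if [ -n "$ENGINE" ]; then
    # Assumption: the engine is selected with an --engine option.
    kamu notebook --engine "$ENGINE"
  else
    kamu notebook
  fi
else
  echo "kamu not found on PATH" >&2
fi
```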
Web UI
And finally, kamu comes with an embedded Web UI that you can use to explore your pipelines and run SQL queries on data from the comfort of your browser:
You can launch it by running:
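A minimal sketch of the launch step, meant to be run from inside a kamu workspace directory. The availability check is only there so the snippet fails gracefully when kamu is not installed.

```shell
if command -v kamu >/dev/null 2>&1; then
  # Starts the embedded Web UI; the process typically keeps running until interrupted.
  kamu ui
else
  echo "kamu not found on PATH" >&2
fi
```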