Apache Superset
Apache Superset is a modern open-source data exploration and visualization platform. It can be connected to kamu
via the native Arrow Flight SQL protocol using a Python client.
In this setup:
- kamu runs a Flight SQL server - a high-performance protocol for data transfer
- In the Superset environment we install an additional flightsql-dbapi Python package
- Superset uses generic database API provided by SQLAlchemy framework
- flightsql-dbapi package provides custom engine implementation for SQLAlchemy that translates all Superset's queries into Flight SQL protocol
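The layering described above can be sketched in Python. SQLAlchemy connection URLs follow the `dialect+driver://` convention, so the first helper uses only the standard library to show which parts of a URL select the engine implementation. The second helper is a rough sketch of the path Superset takes when it runs a query. It assumes `sqlalchemy` and `flightsql-dbapi` are installed and a kamu Flight SQL server is reachable. Both function names are hypothetical, and the `import flightsql.sqlalchemy` line follows the flightsql-dbapi README and may not be required in every version.

```python
from urllib.parse import urlsplit


def describe_sqlalchemy_url(url: str) -> dict:
    """Split a SQLAlchemy-style URL into the parts that select the engine.

    Hypothetical helper for illustration; it only parses the string and
    does not contact any server.
    """
    parts = urlsplit(url)
    # SQLAlchemy URLs use "dialect+driver" as the scheme.
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver,
        "host": parts.hostname,
        "port": parts.port,
    }


def query_kamu(url: str, sql: str) -> list:
    """Run a query through SQLAlchemy, roughly as Superset would.

    Assumes sqlalchemy and flightsql-dbapi are installed and that a kamu
    Flight SQL server is listening at the address in the URL.
    """
    # Imported lazily so the parsing helper works without these packages.
    import flightsql.sqlalchemy  # noqa: F401  (assumed to register the dialect)
    from sqlalchemy import create_engine, text

    engine = create_engine(url)
    with engine.connect() as conn:
        return conn.execute(text(sql)).fetchall()


if __name__ == "__main__":
    url = "datafusion+flightsql://anonymous:anonymous@localhost:50050?insecure=True"
    print(describe_sqlalchemy_url(url))
```

Calling `query_kamu(url, "select 1")` against a running server would exercise the whole chain: SQLAlchemy, the flightsql-dbapi engine, and the Flight SQL server in kamu.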
To connect Superset to kamu follow these simple steps:
- Start with being able to run Superset locally using docker-compose (see this official guide)
  - Rest of the guide assumes that you are launching Superset in “non-dev” mode using:
    docker-compose -f docker-compose-non-dev.yml up
- Install flightsql-dbapi Python package into Superset container:
  - Stop and clean up the environment:
    docker-compose -f docker-compose-non-dev.yml down
  - Create <superset repo>/docker/requirements-local.txt file (as per this guide) with the following contents:
    # At the time of this writing Superset used arrow version with a critical to us bug
    pyarrow==13.0.0
    flightsql-dbapi==0.2.1
- (Optional) Specify your Mapbox API Token in <superset repo>/docker/.env-non-dev
- Run kamu Flight SQL server in a desired workspace:
  kamu sql server --address 0.0.0.0 --port 50050
- Start Superset via docker-compose again
- Create a new database connection in Superset
  - Use "Other" database kind
  - As URL specify:
    datafusion+flightsql://anonymous:anonymous@<hostname or IP>:50050?insecure=True
  - Skip insecure=True when node is set up with TLS
  - To authenticate via access token use:
    datafusion+flightsql://<hostname or IP>:50050?token=<TOKEN>
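The URL variants listed in the last step can be put together with a small helper. This is a minimal sketch using only the standard library. The function name `kamu_superset_url` and its parameters are hypothetical, and combining `token` with `insecure=True` for a node without TLS is an assumption that the steps above do not spell out.

```python
from typing import Optional
from urllib.parse import urlencode


def kamu_superset_url(
    host: str,
    port: int = 50050,
    token: Optional[str] = None,
    tls: bool = False,
) -> str:
    """Build the SQLAlchemy URL to paste into Superset's "Other" database form.

    Hypothetical helper for illustration only.
    """
    query = {}
    if token is not None:
        # Token authentication: no user/password in the URL.
        netloc = f"{host}:{port}"
        query["token"] = token
    else:
        # Anonymous access, as in the default URL from the steps above.
        netloc = f"anonymous:anonymous@{host}:{port}"
    if not tls:
        # insecure=True is skipped when the node is set up with TLS.
        # Assumption: it can be combined with token= on a non-TLS node.
        query["insecure"] = "True"

    url = f"datafusion+flightsql://{netloc}"
    if query:
        url += "?" + urlencode(query)
    return url


if __name__ == "__main__":
    print(kamu_superset_url("localhost"))
    print(kamu_superset_url("node.example.com", token="<TOKEN>", tls=True))
```

Note that `urlencode` percent-encodes special characters in the token, so the placeholder `<TOKEN>` in the usage lines prints in encoded form. A real token made of URL-safe characters is passed through unchanged.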