Apache Superset
Apache Superset is a modern open-source data exploration and visualization platform. It can be connected to kamu via the native Arrow Flight SQL protocol using a Python client.
In this setup:
- kamu runs a Flight SQL server (Flight SQL is a high-performance protocol for data transfer)
- In the Superset environment we install an additional flightsql-dbapi Python package
- Superset uses the generic database API provided by the SQLAlchemy framework
- The flightsql-dbapi package provides a custom engine implementation for SQLAlchemy that translates all of Superset's queries into the Flight SQL protocol
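To illustrate the setup above, here is a minimal sketch of what happens when a client goes through SQLAlchemy. It assumes that sqlalchemy and flightsql-dbapi are installed in the same Python environment, that installing flightsql-dbapi is enough for SQLAlchemy to resolve the datafusion+flightsql scheme, and that a kamu Flight SQL server is reachable. The helper name run_query is hypothetical and is not part of Superset or kamu.

```python
def run_query(url: str, sql: str):
    """Run one SQL statement through SQLAlchemy and return all rows.

    `url` is expected to use the datafusion+flightsql scheme so that
    SQLAlchemy hands the query to the flightsql-dbapi engine.
    """
    # Imported lazily, so defining this helper does not require the packages.
    from sqlalchemy import create_engine, text

    # SQLAlchemy picks the engine implementation from the URL scheme.
    engine = create_engine(url)
    with engine.connect() as conn:
        # The statement travels to the server over Flight SQL.
        return conn.execute(text(sql)).fetchall()
```

With a server listening locally (an assumption), a call such as run_query("datafusion+flightsql://anonymous:anonymous@localhost:50050?insecure=True", "select 1") would exercise the same path that Superset uses for its queries.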
To connect Superset to kamu follow these simple steps:
- Start with being able to run Superset locally using docker-compose (see this official guide)
  - The rest of the guide assumes that you are launching Superset in "non-dev" mode using:
      docker-compose -f docker-compose-non-dev.yml up
- Install the flightsql-dbapi Python package into the Superset container:
  - Stop and clean up the environment:
      docker-compose -f docker-compose-non-dev.yml down
  - Create the <superset repo>/docker/requirements-local.txt file (as per this guide) with the following contents:
      # At the time of this writing Superset used an arrow version with a bug that is critical for us
      pyarrow==13.0.0
      flightsql-dbapi==0.2.1
- (Optional) Specify your Mapbox API token in <superset repo>/docker/.env-non-dev
- Run the kamu Flight SQL server in the desired workspace:
    kamu sql server --address 0.0.0.0 --port 50050
- Start Superset via docker-compose again
- Create a new database connection in Superset
  - Use the "Other" database kind
  - As the URL specify:
      datafusion+flightsql://anonymous:anonymous@<hostname or IP>:50050?insecure=True
  - Omit insecure=True when the node is set up with TLS
  - To authenticate via an access token use:
      datafusion+flightsql://<hostname or IP>:50050?token=<TOKEN>
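The URL variants above can be summarized in a small helper. This is only a sketch: the function name kamu_superset_url and its parameters are hypothetical, and combining token with insecure=True for a node without TLS is an assumption that the guide does not spell out.

```python
from typing import Optional
from urllib.parse import urlencode


def kamu_superset_url(host: str, port: int = 50050, *,
                      tls: bool = False,
                      token: Optional[str] = None) -> str:
    """Build the SQLAlchemy URL to paste into Superset's "Other" database form."""
    params = {}
    if token is not None:
        # Token authentication: no user/password part, token goes in the query.
        authority = f"{host}:{port}"
        params["token"] = token
    else:
        # Anonymous access, as in the default URL from the guide.
        authority = f"anonymous:anonymous@{host}:{port}"
    if not tls:
        # Only needed when the node is NOT set up with TLS.
        # (Assumption: this also applies when a token is used.)
        params["insecure"] = "True"
    query = urlencode(params)
    return f"datafusion+flightsql://{authority}" + (f"?{query}" if query else "")


# Anonymous connection to a node without TLS.
print(kamu_superset_url("localhost"))
# Token-authenticated connection to a TLS-enabled node.
print(kamu_superset_url("example.org", tls=True, token="TOKEN"))
```

Here localhost, example.org and TOKEN are placeholders; substitute the hostname or IP and the access token of your own node.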