Config Reference
CLIConfig
| Field | Type | Default | Description |
|---|---|---|---|
auth | AuthConfig | | Auth configuration |
database | DatabaseConfig | null | Database connection configuration |
datasetEnvVars | DatasetEnvVarsConfig | | Dataset environment variables configuration |
didEncryption | DidSecretEncryptionConfig | | DID secret key encryption configuration |
engine | EngineConfig | | Engine configuration |
extra | ExtraConfig | | Experimental and temporary configuration options |
flowSystem | FlowSystemConfig | | Configuration for flow system |
frontend | FrontendConfig | | Data access and visualization configuration |
identity | IdentityConfig | {} | UNSTABLE: Identity configuration |
outbox | OutboxConfig | | Messaging outbox configuration |
protocol | ProtocolConfig | | Network protocols configuration |
quotaDefaults | QuotaDefaults | | Default quotas configured by type |
search | SearchConfig | | Search configuration |
source | SourceConfig | | Source configuration |
uploads | UploadsConfig | | Uploads configuration |
webhooks | WebhooksConfig | | Configuration for webhooks |
AccountConfig
| Field | Type | Default | Description |
|---|---|---|---|
accountName | AccountName | ||
accountType | AccountType | "User" | |
avatarUrl | string | null | |
displayName | string | null | Auto-derived from |
email | Email | ||
id | AccountID | null | Auto-derived from |
password | Password | ||
properties | array | [] | |
provider | string | "password" | |
registeredAt | string | null | |
treatDatasetsAsPublic | boolean | false |
AccountID
Base type: string
AccountName
Base type: string
AccountPropertyName
| Variants |
|---|
CanProvisionAccounts |
Admin |
AccountType
| Variants |
|---|
User |
Organization |
AuthConfig
| Field | Type | Default | Description |
|---|---|---|---|
allowAnonymous | boolean | true | |
users | PredefinedAccountsConfig | |
ContainerRuntimeType
| Variants |
|---|
Docker |
Podman |
DatabaseConfig
| Variants |
|---|
Sqlite |
Postgres |
MySql |
MariaDB |
DatabaseConfig::Sqlite
| Field | Type | Default | Description |
|---|---|---|---|
databasePath | string | ||
provider | string |
DatabaseConfig::Postgres
| Field | Type | Default | Description |
|---|---|---|---|
acquireTimeoutSecs | integer | null | |
credentialsPolicy | DatabaseCredentialsPolicyConfig | ||
databaseName | string | ||
host | string | ||
maxConnections | integer | null | |
maxLifetimeSecs | integer | null | |
port | integer | null | |
provider | string |
DatabaseConfig::MySql
| Field | Type | Default | Description |
|---|---|---|---|
acquireTimeoutSecs | integer | null | |
credentialsPolicy | DatabaseCredentialsPolicyConfig | ||
databaseName | string | ||
host | string | ||
maxConnections | integer | null | |
maxLifetimeSecs | integer | null | |
port | integer | null | |
provider | string |
DatabaseConfig::MariaDB
| Field | Type | Default | Description |
|---|---|---|---|
acquireTimeoutSecs | integer | null | |
credentialsPolicy | DatabaseCredentialsPolicyConfig | ||
databaseName | string | ||
host | string | ||
maxConnections | integer | null | |
maxLifetimeSecs | integer | null | |
port | integer | null | |
provider | string |
DatabaseCredentialSourceConfig
| Variants |
|---|
RawPassword |
AwsSecret |
AwsIamToken |
DatabaseCredentialSourceConfig::RawPassword
| Field | Type | Default | Description |
|---|---|---|---|
kind | string | ||
rawPassword | string | ||
userName | string |
DatabaseCredentialSourceConfig::AwsSecret
| Field | Type | Default | Description |
|---|---|---|---|
kind | string | ||
secretName | string |
DatabaseCredentialSourceConfig::AwsIamToken
| Field | Type | Default | Description |
|---|---|---|---|
kind | string | ||
userName | string |
DatabaseCredentialsPolicyConfig
| Field | Type | Default | Description |
|---|---|---|---|
rotationFrequencyInMinutes | integer | null | |
source | DatabaseCredentialSourceConfig |
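To show how the database types above nest, here is a minimal sketch that assembles a Postgres section as a plain Python dictionary and prints it as JSON. The field names come from the tables above. The discriminator values `"postgres"` and `"rawPassword"`, the helper name `postgres_database_config`, and the sample host and credentials are assumptions for illustration, since this reference does not list the accepted `provider` and `kind` strings.

```python
import json


def postgres_database_config(host, database_name, user_name, raw_password, port=5432):
    """Build a DatabaseConfig::Postgres-shaped dictionary (hypothetical helper)."""
    return {
        "provider": "postgres",  # assumed discriminator value, not confirmed by this reference
        "host": host,
        "port": port,  # 5432 is the conventional Postgres port
        "databaseName": database_name,
        "credentialsPolicy": {
            # rotationFrequencyInMinutes is omitted, so its null default applies
            "source": {
                "kind": "rawPassword",  # assumed discriminator value
                "userName": user_name,
                "rawPassword": raw_password,
            }
        },
    }


if __name__ == "__main__":
    config = postgres_database_config("localhost", "kamu", "kamu", "change-me")
    print(json.dumps({"database": config}, indent=2))
```

The optional pool settings (`maxConnections`, `maxLifetimeSecs`, `acquireTimeoutSecs`) are left out here, so their `null` defaults apply.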
DatasetEnvVarsConfig
| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | false | |
encryptionKey | string | null | Represents the encryption key for the dataset env vars. This field is required if enabled is true. The encryption key must be a 32-character alphanumeric string, which includes both uppercase and lowercase Latin letters (A-Z, a-z) and digits (0-9). To generate use: |
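One possible way to produce a key matching the description above (32 characters drawn from A-Z, a-z and 0-9) is sketched below. This is not the generation command the reference points to; the function name `generate_encryption_key` is hypothetical, and the sketch relies only on Python's standard `secrets` module.

```python
import secrets
import string

# Alphabet named in the field description: A-Z, a-z and 0-9.
ALPHABET = string.ascii_letters + string.digits


def generate_encryption_key(length: int = 32) -> str:
    """Return a random alphanumeric key using a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_encryption_key())
```

The `secrets` module is used instead of `random` because the key protects stored secrets and should not be predictable.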
DidSecretEncryptionConfig
| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | false | |
encryptionKey | string | null | The encryption key must be a 32-character alphanumeric string, which includes both uppercase and lowercase Latin letters (A-Z, a-z) and digits (0-9). To generate use: |
DurationString
Base type: string
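The defaults in this reference use values such as "5s", "30s" and "30m". Their meaning can be sketched with a small parser, assuming a single integer followed by one unit suffix. The accepted grammar is not specified here, so the unit list and the `parse_duration` name are assumptions, and the real format may allow more (for example compound values).

```python
import re
from datetime import timedelta

# Assumed unit suffixes; only "s" and "m" actually appear in this reference.
_UNITS = {"ms": "milliseconds", "s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
_PATTERN = re.compile(r"(\d+)(ms|s|m|h|d)")


def parse_duration(text: str) -> timedelta:
    """Parse strings like '5s' or '30m' into a timedelta."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported duration string: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


if __name__ == "__main__":
    for sample in ("5s", "30s", "30m"):
        print(sample, "->", parse_duration(sample).total_seconds(), "seconds")
```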
Email
Base type: string
EmbeddingsChunkerConfig
| Variants |
|---|
Simple |
EmbeddingsChunkerConfig::Simple
| Field | Type | Default | Description |
|---|---|---|---|
kind | string | ||
splitParagraphs | boolean | false | |
splitSections | boolean | false |
EmbeddingsEncoderConfig
| Variants |
|---|
Dummy |
OpenAi |
EmbeddingsEncoderConfig::Dummy
| Field | Type | Default | Description |
|---|---|---|---|
kind | string |
EmbeddingsEncoderConfig::OpenAi
| Field | Type | Default | Description |
|---|---|---|---|
apiKey | string | null | |
dimensions | integer | 1536 | |
kind | string | ||
modelName | string | "text-embedding-ada-002" | |
url | string | null |
EngineConfig
| Field | Type | Default | Description |
|---|---|---|---|
datafusionEmbedded | EngineConfigDatafusion | | Embedded Datafusion engine configuration |
images | EngineImagesConfig | | UNSTABLE: Default engine images |
maxConcurrency | integer | null | Maximum number of engine operations that can be performed concurrently |
networkNs | NetworkNamespaceType | "Private" | Type of the networking namespace (relevant when running in container environments) |
runtime | ContainerRuntimeType | "Docker" | Type of the runtime to use when running the data processing engines |
shutdownTimeout | DurationString | "5s" | Timeout for waiting for the engine container to stop gracefully |
startTimeout | DurationString | "30s" | Timeout for starting an engine container |
EngineConfigDatafusion
| Field | Type | Default | Description |
|---|---|---|---|
base | object | | Base configuration options. See: |
batchQuery | object | {} | Batch query-specific overrides to the base config |
compaction | object | | Compaction-specific overrides to the base config |
ingest | object | | Ingest-specific overrides to the base config |
useLegacyArrowBufferEncoding | boolean | false | Makes arrow batches use contiguous See: kamu-node#277 |
EngineImagesConfig
| Field | Type | Default | Description |
|---|---|---|---|
datafusion | string | "ghcr.io/kamu-data/engine-datafusion:0.9.0" | UNSTABLE: |
flink | string | "ghcr.io/kamu-data/engine-flink:0.18.2-flink_1.16.0-scala_2.12-java8" | UNSTABLE: |
risingwave | string | "ghcr.io/kamu-data/engine-risingwave:0.2.0-risingwave_1.7.0-alpha" | UNSTABLE: |
spark | string | "ghcr.io/kamu-data/engine-spark:0.23.1-spark_3.5.0" | UNSTABLE: |
EthRpcEndpoint
| Field | Type | Default | Description |
|---|---|---|---|
chainId | integer | ||
chainName | string | ||
nodeUrl | string |
EthereumSourceConfig
| Field | Type | Default | Description |
|---|---|---|---|
commitAfterBlocksScanned | integer | 1000000 | Forces iteration to stop after the specified number of blocks were scanned even if we didn’t reach the target record number. This is useful to not lose a lot of scanning progress in case of an RPC error. |
getLogsBlockStride | integer | 100000 | Default number of blocks to scan within one query to |
rpcEndpoints | array | [] | Default RPC endpoints to use if source does not specify one explicitly. |
useBlockTimestampFallback | boolean | false | Many providers don’t yet return |
ExtraConfig
| Field | Type | Default | Description |
|---|---|---|---|
graphql | GqlConfig | {} |
FlightSqlConfig
| Field | Type | Default | Description |
|---|---|---|---|
allowAnonymous | boolean | true | Whether clients can authenticate as 'anonymous' user |
anonSessionExpirationTimeout | DurationString | "30m" | Time after which |
anonSessionInactivityTimeout | DurationString | "5s" | Time after which |
authedSessionExpirationTimeout | DurationString | "30m" | Time after which |
authedSessionInactivityTimeout | DurationString | "5s" | Time after which |
FlowAgentConfig
| Field | Type | Default | Description |
|---|---|---|---|
awaitingStepSecs | integer | 1 | |
defaultRetryPolicies | object | {} | |
mandatoryThrottlingPeriodSecs | integer | 60 |
FlowSystemConfig
| Field | Type | Default | Description |
|---|---|---|---|
flowAgent | FlowAgentConfig | | |
flowSystemEventAgent | FlowSystemEventAgentConfig | | |
taskAgent | TaskAgentConfig | |
FlowSystemEventAgentConfig
| Field | Type | Default | Description |
|---|---|---|---|
batchSize | integer | 20 | |
maxListeningTimeoutMs | integer | 2000 | |
minDebounceIntervalMs | integer | 100 |
FrontendConfig
| Field | Type | Default | Description |
|---|---|---|---|
jupyter | JupyterConfig | | Integrated Jupyter notebook configuration |
GqlConfig
| Field | Type | Default | Description |
|---|---|---|---|
HttpSourceConfig
| Field | Type | Default | Description |
|---|---|---|---|
connectTimeout | DurationString | "30s" | Timeout for the connect phase of the HTTP client |
maxRedirects | integer | 10 | Maximum number of redirects to follow |
userAgent | string | "kamu-cli/0.260.1" | Value to use for User-Agent header |
IdentityConfig
| Field | Type | Default | Description |
|---|---|---|---|
privateKey | PrivateKey | null | Private key used to sign API responses. Currently only To generate use: The command above: |
IpfsConfig
| Field | Type | Default | Description |
|---|---|---|---|
httpGateway | string | "http://localhost:8080/" | HTTP Gateway URL to use for downloads. For safety, it defaults to |
preResolveDnslink | boolean | true | Whether kamu should pre-resolve IPNS |
JupyterConfig
| Field | Type | Default | Description |
|---|---|---|---|
image | string | "ghcr.io/kamu-data/jupyter:0.7.1" | Jupyter notebook server image |
livyImage | string | "ghcr.io/kamu-data/engine-spark:0.23.1-spark_3.5.0" | UNSTABLE: Livy + Spark server image |
MqttSourceConfig
| Field | Type | Default | Description |
|---|---|---|---|
brokerIdleTimeout | DurationString | "1s" | Time to wait for the MQTT broker to send us some data, after which we consider that we have “caught up” and end the polling loop. |
NetworkNamespaceType
Corresponds to podman’s containers.conf::netns.
When podman is used inside containers (e.g. podman-in-docker or podman-in-k8s),
it usually uses the host network namespace.
| Variants |
|---|
Private |
Host |
OutboxConfig
| Field | Type | Default | Description |
|---|---|---|---|
awaitingStepSecs | integer | 1 | |
batchSize | integer | 20 |
Password
Base type: string
PredefinedAccountsConfig
| Field | Type | Default | Description |
|---|---|---|---|
predefined | array | [] |
PrivateKey
Base type: string
ProtocolConfig
| Field | Type | Default | Description |
|---|---|---|---|
flightSql | FlightSqlConfig | | |
ipfs | IpfsConfig | | IPFS configuration |
QuotaDefaults
| Field | Type | Default | Description |
|---|---|---|---|
storage | integer | 1000000000 |
RetryPolicyConfig
| Field | Type | Default | Description |
|---|---|---|---|
backoffType | RetryPolicyConfigBackoffType | "Fixed" | |
maxAttempts | integer | 0 | |
minDelaySecs | integer | 0 |
RetryPolicyConfigBackoffType
| Variants |
|---|
Fixed |
Linear |
Exponential |
ExponentialWithJitter |
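This reference names the backoff variants but does not define their formulas. The sketch below shows the conventional meaning of each name, with `minDelaySecs` as the base delay. The exact formulas, the jitter range, and the function name `retry_delay_secs` are assumptions, not the behavior of the actual implementation.

```python
import random


def retry_delay_secs(backoff_type: str, min_delay_secs: int, attempt: int, rng=random) -> float:
    """Return the assumed delay before retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if backoff_type == "Fixed":
        return float(min_delay_secs)  # same delay every time
    if backoff_type == "Linear":
        return float(min_delay_secs * attempt)  # grows by a constant step
    exponential = float(min_delay_secs * 2 ** (attempt - 1))  # doubles each attempt
    if backoff_type == "Exponential":
        return exponential
    if backoff_type == "ExponentialWithJitter":
        # Assumed jitter: add a random amount up to the exponential delay itself.
        return exponential + rng.uniform(0, exponential)
    raise ValueError(f"unknown backoff type: {backoff_type!r}")


if __name__ == "__main__":
    for kind in ("Fixed", "Linear", "Exponential", "ExponentialWithJitter"):
        print(kind, [round(retry_delay_secs(kind, 10, n), 1) for n in (1, 2, 3)])
```

Jitter spreads retries from many failing flows over time, so they do not all hit the same resource at once.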
SearchConfig
| Field | Type | Default | Description |
|---|---|---|---|
embeddingsChunker | EmbeddingsChunkerConfig | | Embeddings chunker configuration |
embeddingsEncoder | EmbeddingsEncoderConfig | | Embeddings encoder configuration |
indexer | SearchIndexerConfig | | Indexer configuration |
repo | SearchRepositoryConfig | | Search repository configuration |
SearchIndexerConfig
| Field | Type | Default | Description |
|---|---|---|---|
clearOnStart | boolean | false | Whether to clear and re-index on start or use existing vectors if any |
incrementalIndexing | boolean | true | Whether incremental indexing is enabled |
SearchRepositoryConfig
| Variants |
|---|
Dummy |
Elasticsearch |
ElasticsearchContainer |
SearchRepositoryConfig::Dummy
| Field | Type | Default | Description |
|---|---|---|---|
kind | string |
SearchRepositoryConfig::Elasticsearch
| Field | Type | Default | Description |
|---|---|---|---|
caCertPemPath | string | null | |
embeddingDimensions | integer | 1536 | |
enableCompression | boolean | false | |
indexPrefix | string | "" | |
kind | string | ||
password | string | null | |
timeoutSecs | integer | 30 | |
url | string | "http://localhost:9200/" |
SearchRepositoryConfig::ElasticsearchContainer
| Field | Type | Default | Description |
|---|---|---|---|
embeddingDimensions | integer | 1536 | |
image | string | "docker.io/elasticsearch:9.2.1" | |
kind | string | ||
startTimeout | DurationString | "30s" |
SourceConfig
| Field | Type | Default | Description |
|---|---|---|---|
ethereum | EthereumSourceConfig | | Ethereum-specific configuration |
http | HttpSourceConfig | | HTTP-specific configuration |
mqtt | MqttSourceConfig | | MQTT-specific configuration |
targetRecordsPerSlice | integer | 10000 | Target number of records after which we will stop consuming from the resumable source and commit data, leaving the rest for the next iteration. This ensures that one data slice doesn’t become too big. |
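The idea behind `targetRecordsPerSlice` can be illustrated as follows: consume from a resumable source until the target is reached, commit, and let the next iteration pick up where the previous one stopped. This is only a sketch of the concept. The `read_slice` helper and the use of a Python iterator as the "resumable source" are assumptions.

```python
from itertools import islice
from typing import Iterator, List, TypeVar

T = TypeVar("T")


def read_slice(source: Iterator[T], target_records_per_slice: int = 10000) -> List[T]:
    """Take up to the target number of records, leaving the rest unread in the source."""
    return list(islice(source, target_records_per_slice))


if __name__ == "__main__":
    records = iter(range(25))  # stands in for a resumable source
    while True:
        data_slice = read_slice(records, target_records_per_slice=10)
        if not data_slice:
            break
        print(f"committing slice of {len(data_slice)} records")
```

With 25 records and a target of 10, the loop commits slices of 10, 10 and 5 records, so no single slice grows beyond the target.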
TaskAgentConfig
| Field | Type | Default | Description |
|---|---|---|---|
checkingIntervalSecs | integer | 1 |
UploadsConfig
| Field | Type | Default | Description |
|---|---|---|---|
maxFileSizeInMb | integer | 50 |
WebhooksConfig
| Field | Type | Default | Description |
|---|---|---|---|
deliveryTimeout | integer | 10 | |
maxConsecutiveFailures | integer | 5 | |
secretEncryptionEnabled | boolean | false | |
secretEncryptionKey | string | null | Represents the encryption key for the webhooks secret. This field is required if secretEncryptionEnabled is true. The encryption key must be a 32-character alphanumeric string, which includes both uppercase and lowercase Latin letters (A-Z, a-z) and digits (0-9). Example |
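The key constraints described above can be checked before a configuration is deployed. The sketch below validates a WebhooksConfig-shaped dictionary. It assumes that the key is required when `secretEncryptionEnabled` is true; the function names are hypothetical and this is not the validation the tool itself performs.

```python
import re
from typing import List

# Exactly 32 characters from A-Z, a-z and 0-9, as stated in the field description.
KEY_PATTERN = re.compile(r"[A-Za-z0-9]{32}")


def is_valid_encryption_key(key: str) -> bool:
    """Check the 32-character alphanumeric constraint."""
    return KEY_PATTERN.fullmatch(key) is not None


def check_webhooks_config(config: dict) -> List[str]:
    """Return a list of problems found in a WebhooksConfig-shaped dict."""
    problems = []
    enabled = config.get("secretEncryptionEnabled", False)
    key = config.get("secretEncryptionKey")
    if enabled and key is None:
        problems.append("secretEncryptionKey is required when secretEncryptionEnabled is true")
    if key is not None and not is_valid_encryption_key(key):
        problems.append("secretEncryptionKey must be 32 alphanumeric characters")
    return problems


if __name__ == "__main__":
    print(check_webhooks_config({"secretEncryptionEnabled": True}))
```

The same `is_valid_encryption_key` check applies to the `encryptionKey` fields of DatasetEnvVarsConfig and DidSecretEncryptionConfig, which state the same constraint.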