ApiServerConfig
| Field | Type | Default | Description |
|---|---|---|---|
auth | AuthConfig | | Authentication & authorization |
database | DatabaseConfig | | Database |
datasetEnvVars | DatasetEnvVarsConfig | | Dataset environment variable feature |
email | EmailConfig | | Email gateway configuration |
engine | EngineConfig | | Ingest and transform engines |
extra | ExtraConfig | | Experimental and temporary module configuration |
flowSystem | FlowSystemConfig | | Configuration for the flow system |
identity | IdentityConfig | null | UNSTABLE: Identity configuration |
outbox | OutboxConfig | | Outbox configuration |
protocol | ProtocolConfig | | Protocols |
quota | QuotaConfig | | Default quotas configured by type |
repo | RepoConfig | | Dataset repository |
runtime | RuntimeConfig | {} | Tokio runtime |
search | SearchConfig | | Search configuration |
source | SourceConfig | | Ingestion sources |
uploadRepo | UploadRepoConfig | | File upload repository |
url | UrlConfig | | External URLs |
webhooks | WebhooksConfig | | Configuration for webhooks |
AccountConfig
| Field | Type | Default | Description |
|---|---|---|---|
accountName | AccountName | ||
accountType | AccountType | "User" | |
avatarUrl | string | null | |
displayName | string | null | Auto-derived from account_name if omitted |
email | Email | ||
id | AccountID | null | Auto-derived from account_name if omitted |
password | Password | ||
properties | array | [] | |
provider | string | "password" | |
registeredAt | string | null | |
treatDatasetsAsPublic | boolean | false |
AccountID
Base type: string
AccountName
Base type: string
AccountPropertyName
| Variants |
|---|
CanProvisionAccounts |
Admin |
AccountType
| Variants |
|---|
User |
Organization |
AuthConfig
| Field | Type | Default | Description |
|---|---|---|---|
allowAnonymous | boolean | true | |
didEncryption | DidSecretEncryptionConfig | | |
jwtSecret | string | "" | |
passwordPolicy | PasswordPolicyConfig | | |
providers | array | [] |
AuthProviderConfig
| Variants |
|---|
Github |
Password |
AuthProviderConfig::Github
| Field | Type | Default | Description |
|---|---|---|---|
clientId | string | ||
clientSecret | string | ||
kind | string |
AuthProviderConfig::Password
| Field | Type | Default | Description |
|---|---|---|---|
accounts | array | [] | |
kind | string |
ContainerRuntimeType
| Variants |
|---|
Docker |
Podman |
DatabaseConfig
| Variants |
|---|
InMemory |
Sqlite |
Postgres |
DatabaseConfig::InMemory
| Field | Type | Default | Description |
|---|---|---|---|
provider | string |
DatabaseConfig::Sqlite
| Field | Type | Default | Description |
|---|---|---|---|
databasePath | string | ||
provider | string |
DatabaseConfig::Postgres
| Field | Type | Default | Description |
|---|---|---|---|
acquireTimeoutSecs | integer | null | |
credentialsPolicy | DatabaseCredentialsPolicyConfig | ||
databaseName | string | ||
host | string | ||
maxConnections | integer | null | |
maxLifetimeSecs | integer | null | |
port | integer | null | |
provider | string |
DatabaseCredentialSourceConfig
| Variants |
|---|
RawPassword |
AwsSecret |
AwsIamToken |
DatabaseCredentialSourceConfig::RawPassword
| Field | Type | Default | Description |
|---|---|---|---|
kind | string | ||
rawPassword | string | ||
userName | string |
DatabaseCredentialSourceConfig::AwsSecret
| Field | Type | Default | Description |
|---|---|---|---|
kind | string | ||
secretName | string |
DatabaseCredentialSourceConfig::AwsIamToken
| Field | Type | Default | Description |
|---|---|---|---|
kind | string | ||
userName | string |
DatabaseCredentialsPolicyConfig
| Field | Type | Default | Description |
|---|---|---|---|
rotationFrequencyInMinutes | integer | null | |
source | DatabaseCredentialSourceConfig |
DatasetEnvVarsConfig
| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | false | |
encryptionKey | string | null | Represents the encryption key for the dataset env vars. This field is
required if enabled is true or None. The encryption key must be a 32-character alphanumeric string, which
includes both uppercase and lowercase Latin letters (A-Z, a-z) and
digits (0-9). To generate use: |
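A key of the shape described above can be produced in several ways. The following is a minimal Python sketch, not the command from the original documentation. The names `generate_encryption_key` and `is_valid_encryption_key` are hypothetical helpers introduced for illustration. The sketch reads the requirement as "exactly 32 characters drawn from A-Z, a-z and 0-9" and additionally insists on at least one character from each class, which may be stricter than what the server actually checks.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # A-Z, a-z, 0-9
KEY_LENGTH = 32


def is_valid_encryption_key(key: str) -> bool:
    """Check length and alphabet; requiring all three classes is an assumption."""
    return (
        len(key) == KEY_LENGTH
        and all(c in ALPHABET for c in key)
        and any(c in string.ascii_uppercase for c in key)
        and any(c in string.ascii_lowercase for c in key)
        and any(c in string.digits for c in key)
    )


def generate_encryption_key() -> str:
    """Draw keys from a CSPRNG until one passes the check above."""
    while True:
        key = "".join(secrets.choice(ALPHABET) for _ in range(KEY_LENGTH))
        if is_valid_encryption_key(key):
            return key


if __name__ == "__main__":
    print(generate_encryption_key())
```

The same 32-character format is described for `DidSecretEncryptionConfig.encryptionKey` and `WebhooksConfig.secretEncryptionKey`, so a key produced this way should fit those fields as well.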
DidSecretEncryptionConfig
| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | false | |
encryptionKey | string | null | The encryption key must be a 32-character alphanumeric string, which
includes both uppercase and lowercase Latin letters (A-Z, a-z) and
digits (0-9). To generate use: |
DurationString
Base type: string
Email
Base type: string
EmailConfig
| Field | Type | Default | Description |
|---|---|---|---|
gateway | EmailConfigGateway | ||
senderAddress | string | ||
senderName | string | null |
EmailConfigGateway
| Variants |
|---|
Dummy |
Postmark |
EmailConfigGateway::Dummy
| Field | Type | Default | Description |
|---|---|---|---|
kind | string |
EmailConfigGateway::Postmark
| Field | Type | Default | Description |
|---|---|---|---|
apiKey | string | ||
kind | string |
EmbeddingsChunkerConfig
| Variants |
|---|
Simple |
EmbeddingsChunkerConfig::Simple
| Field | Type | Default | Description |
|---|---|---|---|
kind | string | ||
splitParagraphs | boolean | false | |
splitSections | boolean | false |
EmbeddingsEncoderConfig
| Variants |
|---|
OpenAi |
Dummy |
EmbeddingsEncoderConfig::OpenAi
| Field | Type | Default | Description |
|---|---|---|---|
apiKey | string | null | |
dimensions | integer | 1536 | |
kind | string | ||
modelName | string | "text-embedding-ada-002" | |
url | string | null |
EmbeddingsEncoderConfig::Dummy
| Field | Type | Default | Description |
|---|---|---|---|
kind | string |
EngineConfig
| Field | Type | Default | Description |
|---|---|---|---|
datafusionEmbedded | EngineConfigDatafusion | | Embedded Datafusion engine configuration |
images | EngineImagesConfig | | UNSTABLE: Default engine images |
maxConcurrency | integer | null | Maximum number of engine operations that can be performed concurrently |
networkNs | NetworkNamespaceType | "Private" | Type of the networking namespace (relevant when running in container environments) |
runtime | ContainerRuntimeType | "Podman" | Type of the runtime to use when running the data processing engines |
shutdownTimeout | DurationString | "5s" | Timeout for waiting for the engine container to stop gracefully |
startTimeout | DurationString | "30s" | Timeout for starting an engine container |
EngineConfigDatafusion
| Field | Type | Default | Description |
|---|---|---|---|
base | object | | Base configuration options
See: <https://datafusion.apache.org/user-guide/configs.html> |
batchQuery | object | {} | Batch query-specific overrides to the base config |
compaction | object | | Compaction-specific overrides to the base config |
ingest | object | | Ingest-specific overrides to the base config |
useLegacyArrowBufferEncoding | boolean | false | Makes arrow batches use contiguous Binary and Utf8 encodings instead
of more modern BinaryView and Utf8View. This is only needed for
compatibility with some older libraries that don’t yet support them. See: kamu-node#277 |
EngineImagesConfig
| Field | Type | Default | Description |
|---|---|---|---|
datafusion | string | "ghcr.io/kamu-data/engine-datafusion:0.9.0" | UNSTABLE: Datafusion engine image |
flink | string | "ghcr.io/kamu-data/engine-flink:0.18.2-flink_1.16.0-scala_2.12-java8" | UNSTABLE: Flink engine image |
risingwave | string | "ghcr.io/kamu-data/engine-risingwave:0.2.0-risingwave_1.7.0-alpha" | UNSTABLE: RisingWave engine image |
spark | string | "ghcr.io/kamu-data/engine-spark:0.23.1-spark_3.5.0" | UNSTABLE: Spark engine image |
EthRpcEndpoint
| Field | Type | Default | Description |
|---|---|---|---|
chainId | integer | ||
chainName | string | ||
nodeUrl | string |
EthereumSourceConfig
| Field | Type | Default | Description |
|---|---|---|---|
commitAfterBlocksScanned | integer | 1000000 | Forces iteration to stop after the specified number of blocks were scanned even if we didn’t reach the target record number. This is useful to not lose a lot of scanning progress in case of an RPC error. |
getLogsBlockStride | integer | 100000 | Default number of blocks to scan within one query to eth_getLogs RPC
endpoint. |
rpcEndpoints | array | [] | Default RPC endpoints to use if source does not specify one explicitly. |
useBlockTimestampFallback | boolean | false | Many providers don’t yet return blockTimestamp from the eth_getLogs RPC
endpoint, and in such cases the block_timestamp column will be null.
If you enable this fallback, the library will perform an additional call to
eth_getBlock to populate the timestamp, but this may result in a
significant performance penalty when fetching many log records. See: ethereum/execution-apis#295 |
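The interplay between `getLogsBlockStride` and `commitAfterBlocksScanned` can be pictured as a loop over block ranges. The sketch below is an illustration under stated assumptions, not the library's implementation: `plan_block_ranges` is a hypothetical helper, ranges are taken to be inclusive, the stop check is assumed to run after each stride, and the separate stop condition on the target record count is left out.

```python
from typing import Iterator, Tuple


def plan_block_ranges(
    from_block: int,
    to_block: int,
    stride: int = 100_000,          # getLogsBlockStride default
    commit_after: int = 1_000_000,  # commitAfterBlocksScanned default
) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) ranges, one per assumed eth_getLogs query."""
    scanned = 0
    start = from_block
    # Stop when the target block is reached or enough blocks were scanned.
    while start <= to_block and scanned < commit_after:
        end = min(start + stride - 1, to_block)
        yield start, end
        scanned += end - start + 1
        start = end + 1


if __name__ == "__main__":
    for block_range in plan_block_ranges(0, 2_500_000):
        print(block_range)
```

Under these assumptions, a smaller stride means more RPC calls per iteration, while a smaller `commitAfterBlocksScanned` means less scanning progress is at risk if an RPC call fails.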
ExtraConfig
| Field | Type | Default | Description |
|---|---|---|---|
graphql | GqlConfig | {} |
FlightSqlConfig
| Field | Type | Default | Description |
|---|---|---|---|
allowAnonymous | boolean | true | Whether clients can authenticate as ‘anonymous’ user |
anonSessionExpirationTimeout | DurationString | "5m" | Time after which a FlightSQL client session will be forgotten and the client will have to re-authorize (for anonymous clients) |
anonSessionInactivityTimeout | DurationString | "5s" | Time after which a FlightSQL session context will be released to free the resources (for anonymous clients) |
authedSessionExpirationTimeout | DurationString | "30m" | Time after which a FlightSQL client session will be forgotten and the client will have to re-authorize (for authenticated clients) |
authedSessionInactivityTimeout | DurationString | "5s" | Time after which a FlightSQL session context will be released to free the resources (for authenticated clients) |
FlowAgentConfig
| Field | Type | Default | Description |
|---|---|---|---|
awaitingStepSecs | integer | 1 | |
defaultRetryPolicies | object | {} | |
mandatoryThrottlingPeriodSecs | integer | 60 |
FlowSystemConfig
| Field | Type | Default | Description |
|---|---|---|---|
flowAgent | FlowAgentConfig | | |
flowSystemEventAgent | FlowSystemEventAgentConfig | | |
taskAgent | TaskAgentConfig | |
FlowSystemEventAgentConfig
| Field | Type | Default | Description |
|---|---|---|---|
batchSize | integer | 100 | |
maxListeningTimeoutMs | integer | 60000 | |
minDebounceIntervalMs | integer | 100 |
GqlConfig
| Field | Type | Default | Description |
|---|---|---|---|
IdentityConfig
| Field | Type | Default | Description |
|---|---|---|---|
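privateKey | PrivateKey | null | Private key used to sign API responses. Currently only ed25519 keys are supported. To generate use: `dd if=/dev/urandom bs=1 count=32 status=none \| base64 -w0 \| tr '+/' '-_' \| tr -d '=' \| (echo -n u && cat)`. The command above reads 32 random bytes, base64-encodes them, converts the output to the URL-safe base64 alphabet, strips the padding, and prepends the letter `u`. |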
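For environments where the shell pipeline is inconvenient, the same encoding steps can be reproduced in Python. This is a sketch that mirrors the documented command step by step. The function names `generate_private_key` and `decode_private_key` are hypothetical and are not part of any kamu tooling.

```python
import base64
import os


def generate_private_key() -> str:
    """Return 32 random bytes as unpadded base64url with a leading 'u'."""
    raw = os.urandom(32)                     # dd if=/dev/urandom bs=1 count=32
    encoded = base64.urlsafe_b64encode(raw)  # base64 -w0 | tr '+/' '-_'
    # tr -d '=' | (echo -n u && cat)
    return "u" + encoded.rstrip(b"=").decode("ascii")


def decode_private_key(key: str) -> bytes:
    """Inverse of the above; handy for checking that a key holds 32 bytes."""
    if not key.startswith("u"):
        raise ValueError("expected a leading 'u'")
    body = key[1:]
    padding = "=" * (-len(body) % 4)  # restore the stripped padding
    return base64.urlsafe_b64decode(body + padding)


if __name__ == "__main__":
    print(generate_private_key())
```

A key produced this way is always 44 characters long: 43 characters of unpadded base64url for the 32 bytes, plus the leading `u`.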
IpfsConfig
| Field | Type | Default | Description |
|---|---|---|---|
httpGateway | string | "http://localhost:8080/" | HTTP Gateway URL to use for downloads.
For safety, it defaults to http://localhost:8080 - a local IPFS daemon.
If you don’t have IPFS installed, you can set this URL to
one of the public gateways like https://ipfs.io.
List of public gateways can be found here: https://ipfs.github.io/public-gateway-checker/ |
preResolveDnslink | boolean | true | Whether kamu should pre-resolve IPNS DNSLink names using DNS or leave it to the Gateway. |
MqttSourceConfig
| Field | Type | Default | Description |
|---|---|---|---|
brokerIdleTimeoutMs | integer | 1000 | Time in milliseconds to wait for MQTT broker to send us some data after which we will consider that we have “caught up” and end the polling loop. |
NetworkNamespaceType
Corresponds to podman’s containers.conf::netns
When podman is used inside containers (e.g. podman-in-docker or podman-in-k8s),
it usually uses the host network namespace.
| Variants |
|---|
Private |
Host |
OutboxConfig
| Field | Type | Default | Description |
|---|---|---|---|
awaitingStepSecs | integer | 1 | |
batchSize | integer | 20 |
Password
Base type: string
PasswordPolicyConfig
| Field | Type | Default | Description |
|---|---|---|---|
minNewPasswordLength | integer | 8 |
PrivateKey
Base type: string
ProtocolConfig
| Field | Type | Default | Description |
|---|---|---|---|
flightSql | FlightSqlConfig | | FlightSQL configuration |
ipfs | IpfsConfig | | IPFS configuration |
QuotaAccountConfig
| Field | Type | Default | Description |
|---|---|---|---|
defaultStorageLimitInBytes | integer | null |
QuotaConfig
| Field | Type | Default | Description |
|---|---|---|---|
account | QuotaAccountConfig | {} |
RepoCachingConfig
| Field | Type | Default | Description |
|---|---|---|---|
metadataLocalFsCachePath | string | null | |
registryCacheEnabled | boolean | false |
RepoConfig
| Field | Type | Default | Description |
|---|---|---|---|
caching | RepoCachingConfig | | |
dataBlocksPageSize | integer | null | |
repoUrl | UrlOrPath | null |
RetryPolicyConfig
| Field | Type | Default | Description |
|---|---|---|---|
backoffType | RetryPolicyConfigBackoffType | null | |
maxAttempts | integer | null | |
minDelaySecs | integer | null |
RetryPolicyConfigBackoffType
| Variants |
|---|
Fixed |
Linear |
Exponential |
ExponentialWithJitter |
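The reference lists the backoff variants without defining their formulas. The sketch below shows one common reading of these names, purely as an illustration: the formulas, the 1-based `attempt` counter, the jitter scheme, and the names `BackoffType` and `retry_delay_secs` are all assumptions and may differ from what the server does with `minDelaySecs`.

```python
import random
from enum import Enum


class BackoffType(Enum):
    FIXED = "Fixed"
    LINEAR = "Linear"
    EXPONENTIAL = "Exponential"
    EXPONENTIAL_WITH_JITTER = "ExponentialWithJitter"


def retry_delay_secs(backoff: BackoffType, min_delay_secs: int, attempt: int) -> float:
    """Delay before retry number `attempt` (1-based). Formulas are assumed."""
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    if backoff is BackoffType.FIXED:
        return float(min_delay_secs)
    if backoff is BackoffType.LINEAR:
        return float(min_delay_secs * attempt)
    exponential = float(min_delay_secs * 2 ** (attempt - 1))
    if backoff is BackoffType.EXPONENTIAL:
        return exponential
    # Assumed jitter scheme: add a random extra of up to one base delay.
    return exponential + random.uniform(0, min_delay_secs)


if __name__ == "__main__":
    for kind in BackoffType:
        print(kind.value, [round(retry_delay_secs(kind, 5, n), 1) for n in (1, 2, 3)])
```

In this reading, `maxAttempts` would bound how many times the delay function is consulted before the retry sequence gives up.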
RuntimeConfig
| Field | Type | Default | Description |
|---|---|---|---|
maxBlockingThreads | integer | null | |
threadStackSize | integer | null | |
workerThreads | integer | null |
SearchConfig
| Field | Type | Default | Description |
|---|---|---|---|
embeddingsChunker | EmbeddingsChunkerConfig | | Embeddings chunker configuration |
embeddingsEncoder | EmbeddingsEncoderConfig | | Embeddings encoder configuration |
indexer | SearchIndexerConfig | | Indexer configuration |
repo | SearchRepositoryConfig | | Search repository configuration |
semanticSearchThresholdScore | number | 0.0 |
SearchIndexerConfig
| Field | Type | Default | Description |
|---|---|---|---|
clearOnStart | boolean | false | |
incrementalIndexing | boolean | false | Whether incremental indexing is enabled |
SearchRepositoryConfig
| Variants |
|---|
Dummy |
Elasticsearch |
SearchRepositoryConfig::Dummy
| Field | Type | Default | Description |
|---|---|---|---|
kind | string |
SearchRepositoryConfig::Elasticsearch
| Field | Type | Default | Description |
|---|---|---|---|
caCertPemPath | string | null | |
embeddingDimensions | integer | 1536 | |
enableCompression | boolean | false | |
indexPrefix | string | "kamu-node" | |
kind | string | ||
password | string | null | |
timeoutSecs | integer | 30 | |
url | string | "http://localhost:9200" |
SourceConfig
| Field | Type | Default | Description |
|---|---|---|---|
ethereum | EthereumSourceConfig | | Ethereum-specific configuration |
mqtt | MqttSourceConfig | | MQTT-specific configuration |
targetRecordsPerSlice | integer | 10000 | Target number of records after which we will stop consuming from the resumable source and commit data, leaving the rest for the next iteration. This ensures that one data slice doesn’t become too big. |
TaskAgentConfig
| Field | Type | Default | Description |
|---|---|---|---|
taskCheckingIntervalSecs | integer | 1 |
UploadRepoConfig
| Field | Type | Default | Description |
|---|---|---|---|
maxFileSizeMb | integer | 50 | |
storage | UploadRepoStorageConfig | |
UploadRepoStorageConfig
| Variants |
|---|
S3 |
Local |
UploadRepoStorageConfig::S3
| Field | Type | Default | Description |
|---|---|---|---|
bucketS3Url | string | ||
kind | string |
UploadRepoStorageConfig::Local
| Field | Type | Default | Description |
|---|---|---|---|
kind | string |
UrlConfig
| Field | Type | Default | Description |
|---|---|---|---|
baseUrlFlightsql | UrlOrPath | "grpc://localhost:50050" | |
baseUrlPlatform | UrlOrPath | "http://localhost:4200/" | |
baseUrlRest | UrlOrPath | "http://localhost:8080/" |
UrlOrPath
Base type: string
WebhooksConfig
| Field | Type | Default | Description |
|---|---|---|---|
deliveryTimeoutSecs | integer | 10 | |
maxConsecutiveFailures | integer | 5 | |
secretEncryptionEnabled | boolean | false | |
secretEncryptionKey | string | null | Represents the encryption key for the webhooks secret. This field is
required if secret_encryption_enabled is true or None. The encryption key must be a 32-character alphanumeric string, which
includes both uppercase and lowercase Latin letters (A-Z, a-z) and
digits (0-9). Example: `let config = WebhooksConfig { … secret_encryption_enabled: Some(true), encryption_key: Some(String::from("aBcDeFgHiJkLmNoPqRsTuVwXyZ012345")) };` |