7 posts tagged with "kafka"

Kafka-related topics and usage


Spice v2.0-rc.5 (May 27, 2026)

May 27, 2026 · 30 min read

Jack Eadie

Token Plumber at Spice AI

Spice v2.0-rc.5 is now available! 🔥

v2.0.0-rc.5 is the fifth release candidate for advanced testing of v2.0, building on v2.0.0-rc.4.

This release completes the mTLS implementation across server endpoints and outbound connectors, adds MongoDB Change Streams as a new CDC source, and makes Kafka CDC durable by persisting offsets. It expands DML write-back to PostgreSQL, Snowflake, and Arrow, and promotes DuckLake to Beta. It also introduces user-defined functions, on-demand dataset loading, unified query cancellation, dynamic HTTP request headers, subquery-driven request parameters, and provider-aware LLM prompt caching, along with a long list of Cayenne performance improvements.

Highlights in this release candidate include:

Spice Cayenne — CDC throughput, compaction and scan caching, synchronized partition commits, join filter propagation, parallel Vortex writes, lock-free deletion caches
Mutual TLS (mTLS) — TLS cert hot-reload, public mTLS for HTTP and Flight (channel + identity modes), mTLS client certs for FlightSQL and Spice.ai connectors
MongoDB Change Streams — native real-time CDC for MongoDB, no Debezium or Kafka required
Kafka CDC offsets — offsets persisted in sidecar tables for durable, resumable Kafka CDC
PostgreSQL DML — INSERT, UPDATE, DELETE write-back on PostgreSQL datasets
Snowflake DML — INSERT, UPDATE, DELETE write-back on Snowflake datasets
Arrow Primary Key Upserts — native upsert path using primary key matching
DuckLake promoted to Beta — with INSERT support on catalog tables
User-Defined Functions — define SQL UDFs in spicepods, plus remote UDFs over HTTP (Spice.ai Enterprise)
Spatial SQL UDFs — optional geospatial UDFs (ST_*) for geometry workloads
On-Demand Dataset Loading — datasets can be deferred and loaded on first reference
Unified Query Cancellation — Ctrl-C and HTTP request cancellation propagate across all execution paths
Dynamic HTTP Connector — pass-through request headers, subquery-driven params, and JSON schema decomposition
HTTP Rate-Control persistence — rate-limit state persisted in object storage across restarts
refresh_mode: snapshot — point-in-time snapshot acceleration with SQLite/Turso WAL flushing
Storage-profile accelerator tuning — accelerators auto-tune defaults based on local SSD, EBS-class disk, or tmpfs
Provider-Aware LLM Prompt Caching — automatic prompt caching for OpenAI-compatible providers that support it
Responses API — support across all model providers with streaming response.output_text.delta, plus Authorization: Bearer header support

What's New in v2.0.0-rc.5

Cayenne Improvements

Significant performance work across Spice Cayenne-backed catalogs and accelerators.

Ingest throughput: End-to-end improvements to CDC ingest, background compaction, and a new scan-result cache for hot reads; parallel Vortex partition writes; lock-free deletion caches with bloom-prefiltered probes; background retention with CDC pipelining; SQLite metastore pool scaled to 32 for high-concurrency mutation workloads.
Data inlining: Small writes are serialized as Arrow IPC and committed directly into the Cayenne metastore (cayenne_inlined_data), bypassing the staged Vortex write path for low-latency ingest. Inline upserts atomically rewrite existing inline rows instead of emitting side delete markers, and inline data remains query-visible via an in-memory union scan with a generation-keyed decode cache. Inline rows are checkpointed to Vortex when row, segment, or byte thresholds are reached. Defaults are refresh-mode aware: inline writes are enabled by default for high-frequency caching, changes, and fast append workloads and disabled for full, snapshot, and slower append.
Query planning: Join filter propagation across equi-join keys (gated behind runtime.params.cayenne_filter_propagation), range fallback for large join filters, hot-path clone elimination, and IN-list rewrites for large filter lists.
Correctness: Synchronized commits across partitions, correct NULL-sentinel handling for nullable partition expressions (e.g. bucket(N, col)), Vortex panic fix on highly compressible data, and live reads through expired protected snapshots.
Catalog and platform: Refresh-mode-aware compaction defaults, rejection of non-distributed Cayenne catalog configurations, and a vendored Vortex DataFusion integration for faster iteration on the Cayenne planner.

Mutual TLS (mTLS)

Spice.ai Enterprise feature. See Enterprise Security.

Spice now supports full mutual TLS for both HTTP and Arrow Flight endpoints.

TLS cert hot-reload (#10727): The Spice runtime watches for SIGHUP and reloads TLS certificates without restarting, enabling cert rotation with zero downtime.

Public mTLS for HTTP and Flight (#10753): Two client_auth_mode values control how the server handles client certificates:

request — optional mTLS: the server requests a client cert but accepts connections without one (useful for migration windows).
required — strict mTLS: the server requires a valid client cert signed by the configured CA.

mTLS client certs for FlightSQL and Spice.ai connectors (#10764): Outbound connections from the FlightSQL and Spice.ai data connectors can now present client certificates for mutual authentication with upstream services.

Example configuration:

runtime:
  tls:
    enabled: true
    certificate_file: /etc/spice/tls/server.crt
    key_file: /etc/spice/tls/server.key
    client_auth_mode: required
    client_auth_ca_file: /etc/spice/tls/client-ca.crt
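
The difference between the two client_auth_mode values can be illustrated with Python's standard ssl module, which exposes the same distinction. This is a minimal sketch of the semantics only, not Spice's implementation; the `server_context` helper is a hypothetical name, and a real server would also load its certificate chain and the client CA.

```python
import ssl


def server_context(client_auth_mode: str) -> ssl.SSLContext:
    """Hypothetical helper mapping client_auth_mode onto Python's verify modes."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if client_auth_mode == "request":
        # Ask for a client certificate, but accept connections without one.
        ctx.verify_mode = ssl.CERT_OPTIONAL
    elif client_auth_mode == "required":
        # Refuse the handshake unless the client presents a valid certificate.
        ctx.verify_mode = ssl.CERT_REQUIRED
    else:
        raise ValueError(f"unknown client_auth_mode: {client_auth_mode!r}")
    # A real server would continue with ctx.load_cert_chain(...) and
    # ctx.load_verify_locations(...) for the server cert and client CA.
    return ctx
```

In both modes a certificate that is presented is still verified against the configured CA; the modes differ only in whether a missing certificate is tolerated.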

MongoDB Change Streams

MongoDB datasets configured with refresh_mode: changes now stream changes from MongoDB Change Streams into any local accelerator (#10813), providing real-time CDC without Debezium or Kafka.

Example configuration:

datasets:
  - from: mongodb:my_collection
    name: my_collection
    params:
      host: my-cluster.mongodb.net
      db: mydb
    acceleration:
      enabled: true
      engine: duckdb
      refresh_mode: changes

CDC Improvements

See Change Data Capture (CDC) for an overview of CDC in Spice.

Kafka CDC offset persistence (#10823): Kafka CDC offsets are persisted in sidecar tables for durable, resumable streams. On restart or failover, Spice resumes from the last committed offset.
Pipelined CDC ingestion (#10676): Source reads overlap with batch apply, with additional batching, envelope coalescing, and nullability propagation improvements across the apply pipeline.
Debezium schema evolution fix (#10144): Schema changes in Debezium-sourced datasets no longer break dataset initialization on reload (fixes #9782).
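
The idea behind durable, resumable offsets can be sketched as follows. This is an illustration under stated assumptions rather than Spice's implementation: SQLite stands in for the accelerator, the `cdc_offsets` sidecar table and its columns are hypothetical names, and the Kafka offset doubles as the row id for brevity. The point it shows is that applied rows and the next offset are committed in one transaction, so a restart resumes exactly where the last commit left off.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    # Hypothetical sidecar table: one row per (topic, partition).
    conn.execute(
        "CREATE TABLE cdc_offsets (topic TEXT, part INTEGER, next_offset INTEGER, "
        "PRIMARY KEY (topic, part))"
    )
    return conn


def resume_offset(conn: sqlite3.Connection, topic: str, part: int) -> int:
    """Return the offset to read next, or 0 if nothing was committed yet."""
    row = conn.execute(
        "SELECT next_offset FROM cdc_offsets WHERE topic = ? AND part = ?",
        (topic, part),
    ).fetchone()
    return row[0] if row else 0


def apply_batch(conn: sqlite3.Connection, topic: str, part: int, batch: list) -> None:
    """Apply (offset, payload) pairs and record the next offset atomically."""
    if not batch:
        return
    with conn:  # one transaction: both writes commit, or neither does
        conn.executemany(
            "INSERT OR REPLACE INTO events (id, payload) VALUES (?, ?)", batch
        )
        conn.execute(
            "INSERT OR REPLACE INTO cdc_offsets (topic, part, next_offset) "
            "VALUES (?, ?, ?)",
            (topic, part, batch[-1][0] + 1),
        )
```

After a simulated restart, a consumer would call `resume_offset` and continue reading from that position instead of replaying the topic from the beginning.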

PostgreSQL DML Support

The PostgreSQL data connector now supports write-back via INSERT, UPDATE, and DELETE operations (#10446). Combined with the existing read-side federation, PostgreSQL-backed datasets can serve as full read/write tables. The PostgreSQL Catalog connector additionally exposes foreign-key metadata for NSQL and query planning (#10849).

Snowflake DML Support

The Snowflake data connector now supports write-back via INSERT, UPDATE, and DELETE operations (#10747), complementing its existing read capabilities.

Arrow Primary Key Upserts

Arrow-accelerated tables now support native upsert operations using primary key matching (#10749), providing efficient update-or-insert semantics for in-memory datasets.
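
Update-or-insert semantics by primary key can be illustrated with a minimal sketch in plain Python. It does not use Arrow and is not how Spice implements the upsert path; the `upsert` helper and the `id` key column are hypothetical names chosen for illustration.

```python
def upsert(rows: list, incoming: list, key: str = "id") -> list:
    """Replace rows whose primary key matches an incoming row; append the rest."""
    by_key = {row[key]: row for row in rows}
    for row in incoming:
        # An existing key is overwritten in place; a new key is appended.
        by_key[row[key]] = row
    return list(by_key.values())
```

Because Python dictionaries preserve insertion order, updated rows keep their original position and new rows land at the end.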

DuckLake Promoted to Beta

The DuckLake Catalog and Data Connector are promoted to Beta quality (#10743).

DuckLake catalog tables with read_write access now support INSERT operations (#10744), enabling full read/write workflows against DuckLake-backed catalogs. The DuckLake connector also gains a series of correctness fixes for downcast, module registration, schema discovery, and S3 credentials (#10650).

User-Defined Functions

Spice now supports user-defined functions (UDFs) as a first-class spicepod component (#10571), letting you define reusable SQL functions in the spicepod or invoke remote functions over HTTP. The runtime also gains table user functions with HTTP server gating (#10675).

A security fix closes a remote-UDF SSRF vector (#10757).

Spatial SQL UDFs

Spice now ships an optional set of geospatial SQL UDFs (ST_*) for geometry workloads (#10833). The functions are gated behind a build feature and can be invoked from any SQL surface.

On-Demand Dataset Loading

Datasets can now be marked for on-demand loading (#10629). Deferred datasets are registered with a declared schema at startup (#10669) and only fully resolve when first referenced, reducing startup time and memory footprint for spicepods with many seldom-used datasets.

Spicepods also gain columns[].type and columns[].nullable (#10661) with a lenient type parser for declaring schemas inline.
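
The deferred-loading pattern described here can be sketched roughly as below. The `DeferredDataset` class and its `loader` callback are hypothetical names used only to show the shape of the idea: the schema is known at startup, while the data is fetched on first reference and reused afterwards.

```python
from typing import Callable, Optional


class DeferredDataset:
    """Hypothetical illustration: schema declared up front, data loaded lazily."""

    def __init__(self, name: str, schema: dict, loader: Callable[[], list]):
        self.name = name
        self.schema = schema  # available immediately, before any data is read
        self._loader = loader
        self._rows: Optional[list] = None

    @property
    def loaded(self) -> bool:
        return self._rows is not None

    def rows(self) -> list:
        # The first reference triggers the load; later references reuse it.
        if self._rows is None:
            self._rows = self._loader()
        return self._rows
```

A planner can validate queries against `schema` without paying the cost of the load, which is where the startup-time and memory savings would come from.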

Unified Query Cancellation

All query execution paths — HTTP, Flight, FlightSQL, MCP, and internal — now honour a unified cancellation signal (#10390). When a client disconnects, presses Ctrl-C in the REPL, or cancels an in-flight HTTP request, the corresponding query is cancelled end-to-end, freeing resources promptly.
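
End-to-end cancellation with prompt resource release can be illustrated using Python's asyncio. This is an analogy for the behavior described above, not Spice's runtime code; `run_query` and `client_disconnects` are hypothetical names, and the long sleep stands in for a long-running scan.

```python
import asyncio


async def run_query(released: list) -> str:
    try:
        await asyncio.sleep(3600)  # stands in for a long-running scan
        return "done"
    finally:
        # Runs on normal completion and on cancellation alike.
        released.append("scan resources")


async def client_disconnects() -> list:
    released: list = []
    task = asyncio.create_task(run_query(released))
    await asyncio.sleep(0)  # yield once so the query actually starts
    task.cancel()  # the cancellation signal
    try:
        await task
    except asyncio.CancelledError:
        pass  # expected: the query was cancelled, not completed
    return released
```

Running `asyncio.run(client_disconnects())` returns immediately rather than waiting an hour, with the cleanup step recorded in the returned list.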

Dynamic HTTP Connector

The HTTP data connector gains dynamic request headers parameterised from query predicates (#10604), subquery-driven request parameters for fan-out queries (#10636), HTTP response metadata as queryable columns via JSON schema decomposition (#10679), no-limit pagination (#10673), and shared rate-control across HTTP-based connectors using the same backend host (#10648).

HTTP Rate-Control Persistence

The HTTP rate-control state (per-endpoint throttle counters) is now persisted in object storage (#10697), ensuring rate limits survive restarts and are consistent across replicas. Rate-control metrics now use an origin label rather than the connector name for cleaner aggregation (#10689).

The metrics HTTP endpoint (/metrics) is also independently rate-limited (#10162) to prevent scraping from impacting query serving.

`refresh_mode: snapshot`

Spice.ai Enterprise feature. See Acceleration Snapshots.

A new refresh_mode: snapshot provides point-in-time snapshot acceleration (#10651), with SQLite and Turso WAL flushing and a Cayenne metastore slice integration so accelerated readers see a consistent snapshot while writes continue.

Storage-Profile Accelerator Tuning

Acceleration configs gain a new storage_profile field (#10913) with values auto (default), local_ssd, ebs, and tmpfs. Under auto, the runtime detects whether the acceleration store is backed by local SSD, EBS-class network disk, or tmpfs, and applies storage-aware defaults across DuckDB, partitioned DuckDB, SQLite, Turso, and Cayenne file-mode accelerators. Explicit per-accelerator parameters always override the profile defaults.

Provider-Aware LLM Prompt Caching

LLM calls automatically use provider-aware prompt caching (#10645) when the configured model provider supports it (e.g., Anthropic, OpenAI). System prompts and tool descriptions are marked for caching so repeated invocations within the cache window reuse the provider-side cached prefix, reducing latency and cost.

A new searchable registry mode for LLM tools (#10647) lets agents discover tools by semantic search rather than enumerating all tools in the system prompt, which scales to large tool inventories.

Responses API Improvements

The Responses API is now supported across all configured model providers (#10724), including streaming delta events via response.output_text.delta (#10828). The runtime also accepts Authorization: Bearer headers in addition to x-api-key, bumps async-openai, and no longer populates FunctionToolCall.id, so OpenAI-compatible servers can assign the ID themselves (#10911).

Distributed Cluster Improvements

Spice.ai Enterprise feature. See High Availability.

Per-request executor readiness gate (#10860): /v1/ready on schedulers waits for a configurable quorum of executors before returning healthy, enabling proper rolling deployments.
Ballista S3 shuffle reads under cluster mode (#10910): The shuffle reader builds its S3 client from the executor pod's environment, matching the writer. Async queries with runtime.params.shuffle_location: s3://... now complete instead of failing with AccessDenied on shuffle fetches.
Flattened scheduler config (#10450): runtime.scheduler.partition_management.* fields are flattened directly onto runtime.scheduler and renamed under the canonical "partition assignment" terminology. See Breaking Changes.

Caching & Search

Improvements across Caching and Search:

Per-principal cache namespacing (#10702): SQL, search, and caching-accelerator caches are now namespaced per authenticated principal, so cached results never cross identity boundaries.
DuckDB HNSW vector indexes (#10695, #10674, #10668): DuckDB-accelerated views support HNSW vector indexes for vector search, vector search SQL is rewritten to activate HNSW_INDEX_SCAN, and HNSW indexes are preserved across data refresh.
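
Per-principal cache namespacing can be pictured as folding the caller's identity into the cache key. The sketch below is an assumption about the general technique, not the key format Spice uses; `cache_key` is a hypothetical helper.

```python
import hashlib


def cache_key(principal: str, sql: str) -> str:
    """Derive a cache key that differs whenever the principal differs."""
    digest = hashlib.sha256()
    for part in (principal, sql):
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") cannot collide.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Two principals issuing the same SQL text get different keys, so neither can be served the other's cached result.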

Security Improvements

See Authentication and TLS for configuring Spice security.

API key timing-position leak and remote-UDF SSRF (#10757): Closed a timing-based position-disclosure leak in API key comparison and blocked SSRF via remote UDF endpoint parameters.
Configurable allowed_hosts for MCP (#10638): MCP servers can be restricted to an explicit allowlist of upstream hosts.
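
A timing-position leak typically arises when a comparison returns at the first mismatching byte, so response time reveals how much of a guess was correct. The release notes do not describe the fix itself; one common mitigation, shown here as a sketch with the hypothetical helper `api_key_matches`, is a constant-time comparison such as Python's `hmac.compare_digest`.

```python
import hmac


def api_key_matches(presented: str, expected: str) -> bool:
    """Compare API keys without exiting early at the first differing byte."""
    # Encode to bytes: compare_digest accepts str only for ASCII input.
    return hmac.compare_digest(
        presented.encode("utf-8"), expected.encode("utf-8")
    )
```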

SQL, Query, and Developer Experience

See the SQL Reference for the full SQL surface area.

SQL REPL expanded view (#10797): Toggle \x in the REPL for a vertical key-value layout on wide result sets.
FlightSQL Substrait plan support (#10761): The Spice runtime now implements CommandStatementSubstraitPlan, enabling clients to submit query plans as Substrait-encoded protobuf.
MCP auth for streamable HTTP tools (#10927): Streamable HTTP MCP tools support native authentication via mcp_auth_token and mcp_headers, both with full Spice secret expansion.
Elasticsearch FTS engine config and index lifecycle (#10672): Direct FTS engine configuration plus index lifecycle and ingestion controls for the Elasticsearch connector.
Self-hosted Spice connector (#10546): Connect Spice to another self-hosted Spice runtime as a federated source.

Connector Bug Fixes

Notable correctness fixes across the Data Connectors: DynamoDB Streams retry on transient errors (#10794) and typed-NULL handling in DML (#10511); ScyllaDB physical filter pushdown disabled to fix incorrect results (#10772); MSSQL TOP N pushdown for non-nullable sort columns (#10621); DuckLake include filter applied (#10738); DuckDB DELETE/UPDATE on full and caching refresh modes (#10632); checked arithmetic for Turso integer-millis and timestamp-to-nanosecond conversions (#10786, #10666); and Flight GetFlightInfo/DoGet schema parity (#10864). See the Changelog for the full list.

Dependency Updates

Dependency / Component	Version
DuckDB	v1.5.2
Iceberg	v0.9.1
Turso	v0.6.0
Vortex	v0.69.0

Breaking Changes

Flattened runtime.scheduler configuration (#10450): The nested runtime.scheduler.partition_management block has been flattened and renamed to use the canonical "partition assignment" terminology. Migrate as follows:

# Before
runtime:
  scheduler:
    partition_management:
      interval: 30s
      max_assignments_per_cycle: 16
      discovery_timeout: 10s

# After
runtime:
  scheduler:
    partition_assignment_interval: 30s
    max_assignments_per_interval: 16
    partition_discovery_timeout: 10s
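
For spicepods that are generated or managed programmatically, the rename can be applied mechanically. The sketch below operates on an already-parsed configuration dictionary and covers only the three fields shown in the example above; `migrate_scheduler` is a hypothetical helper, and any other field under partition_management raises an error rather than guessing a new name.

```python
import copy

# Old key under runtime.scheduler.partition_management -> new key on runtime.scheduler.
RENAMES = {
    "interval": "partition_assignment_interval",
    "max_assignments_per_cycle": "max_assignments_per_interval",
    "discovery_timeout": "partition_discovery_timeout",
}


def migrate_scheduler(spicepod: dict) -> dict:
    """Return a copy of the config with partition_management flattened."""
    out = copy.deepcopy(spicepod)
    scheduler = out.get("runtime", {}).get("scheduler")
    if not isinstance(scheduler, dict):
        return out  # nothing to migrate
    old = scheduler.pop("partition_management", None) or {}
    for old_key, value in old.items():
        if old_key not in RENAMES:
            raise KeyError(f"no known mapping for partition_management.{old_key}")
        scheduler[RENAMES[old_key]] = value
    return out
```

The input dictionary is left untouched, so the original configuration can be kept for comparison or rollback.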

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.0.0-rc.5, use one of the following methods:

CLI:

spice upgrade v2.0.0-rc.5

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0-rc.5 image:

docker pull spiceai/spiceai:2.0.0-rc.5

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0-rc.5

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

Enable DML support for PostgreSQL data connector by @phillipleblanc in #10446
feat(postgres): support inline PEM sslrootcert by @claudespice in #10578
Add foreign key metadata discovery to PostgreSQL Catalog by @sgrebnov in #10849
Add Snowflake DML support by @lukekim in #10747
Add MongoDB Change Streams support by @lukekim in #10813
Add user-defined functions by @lukekim in #10571
Add table user functions and gate HTTP servers by @lukekim in #10675
feat: add on-demand dataset loading by @phillipleblanc in #10629
feat(runtime): declared-schema deferred datasets by @phillipleblanc in #10669
feat(spicepod, runtime): add columns[].type / nullable + lenient type parser by @phillipleblanc in #10661
Replace external smb crate with internal SMB 3.1.1 client by @phillipleblanc in #10516
Add unified query cancellation across all paths by @lukekim in #10390
Add dynamic HTTP request headers by @lukekim in #10604
feat(http): Support dynamic HTTP connector request params from subqueries by @lukekim in #10636
feat(http): pass through HTTP metadata columns with JSON schema decomposition by @lukekim in #10679
Add nolimit HTTP pagination max pages by @lukekim in #10673
Add shared HTTP rate control for connectors by @lukekim in #10648
Use origin label instead of name for HTTP rate control metrics by @lukekim in #10689
fix(http): reject OR across different HTTP filter columns by @lukekim in #10625
Add provider-aware LLM prompt caching by @lukekim in #10645
Add searchable registry mode for LLM tools by @lukekim in #10647
feat: refresh_mode: snapshot + SQLite/Turso WAL flush + Cayenne metastore slice by @phillipleblanc in #10651
feat: per-principal cache namespacing for SQL/search/caching-accelerator by @lukekim in #10702
Add self-hosted Spice connector support by @phillipleblanc in #10546
Add Delta Lake Azure tenant parameter by @phillipleblanc in #10671
Support OAuth2 client credentials in 'spice cloud login' by @ewgenius in #10586
Add configurable allowed_hosts for MCP by @lukekim in #10638
fix: make Helm chart probes configurable by @peasee in #10696
Strip high-cardinality datasets dim from anonymous telemetry by @lukekim in #10711
feat(elasticsearch): direct FTS engine config + index lifecycle and ingestion controls by @lukekim in #10672
Add DuckDB HNSW vector index support for accelerated views by @sgrebnov in #10695
Rewrite DuckDB vector search SQL to activate HNSW_INDEX_SCAN by @sgrebnov in #10674
Fix DuckDB HNSW vector indexes lost after data refresh by @sgrebnov in #10668
Fix DuckDB DELETE/UPDATE on full and caching refresh mode datasets by @phillipleblanc in #10632
Fix DuckLake connector: downcast, module registration, schema discovery, and S3 credentials by @sgrebnov in #10650
Fix federation pushing denied functions inside subqueries to remote engines by @phillipleblanc in #10692
fix(caching): honour refresh_on_startup: always in caching mode by @phillipleblanc in #10594
fix(iceberg): rebuild storage factory when Hadoop catalog scheme is inferred by @sgrebnov in #10601
Pipeline CDC ingestion: overlap source reads with batch apply by @lukekim in #10676
fix: add NULL check to CDC primary key extraction by @lukekim in #10684
Properly handle nullability during CDC processing by @krinart in #10803
Flatten scheduler config and rename partition management → partition assignment by @lukekim in #10450
Improve NSQL UX and harden internal LLM tools by @lukekim in #10715
Support Responses API across model providers by @lukekim in #10724
Update xAI default model and handle Grok model retirements by @Jeadie in #10723
Improve cli table layout by @krinart in #10725
TLS cert hot-reload (mTLS plan M1) by @phillipleblanc in #10727
Fix DuckLake catalog include filter being ignored by @phillipleblanc in #10738
Promote DuckLake Catalog and Data Connector to Beta quality by @sgrebnov in #10743
feat(ducklake): Support INSERT on catalog tables with read_write access by @sgrebnov in #10744
perf(cdc): coalesce envelopes and overlap commits in apply pipeline by @lukekim in #10745
feat: Allow full version tags in spicepod version by @peasee in #10748
Add Arrow primary key upserts by @lukekim in #10749
fix(snapshot): keep refresh_mode snapshot read-only by @phillipleblanc in #10752
feat(tls): public mTLS for HTTP and Flight (channel + identity modes) by @phillipleblanc in #10753
perf(cayenne): lock-free deletion caches with bloom-prefiltered probe by @lukekim in #10756
fix(security): close API key timing-position leak and remote-UDF SSRF by @lukekim in #10757
Fix 'wait_until_dependent_tables_are_ready' for catalogs by @phillipleblanc in #10758
Fixes for views and resolved tables on 'spice refresh' CLI by @phillipleblanc in #10759
Implement FlightSQL CommandStatementSubstraitPlan support by @lukekim in #10761
feat(connectors): mTLS client cert support for flightsql and spiceai connectors by @phillipleblanc in #10764
Allow arbitrary filenames when specifying spicepod path + kind validation by @krinart in #10777
fix: ignore field metadata in schema compatibility check in index_table_scan by @Jeadie in #10778
Display pushed-down limits in EXPLAIN TREE output by @lukekim in #10779
fix: enable streaming append for Kafka with Cayenne accelerator by @lukekim in #10780
fix: bound chunked-index intermediate batch size to prevent OOM by @phillipleblanc in #10783
fix: label all columns in spice cloud metrics table output by @claudespice in #10784
fix: use checked arithmetic for Turso integer-millis timestamp read path by @claudespice in #10786
fix: use checked arithmetic in timestamp-to-nanosecond conversions by @claudespice in #10666
Upgrade to DuckDB v1.5.2 by @sgrebnov in #10788
Improve CDC ingestion performance by @lukekim in #10789
Fix tool_search/tool_invoke spans by @lukekim in #10791
Add Cayenne inline mutations and benchmark coverage by @lukekim in #10792
Ensure we always resolve table names in distributed mode/metadata by @Jeadie in #10793
Remove permanent errors from DynamoDB Streams by @krinart in #10794
Add expanded view mode for wide table display in SQL REPL by @lukekim in #10797
Fix Cayenne CDC schema mismatch error by @sgrebnov in #10800
Executors should create catalog tables on join by @Jeadie in #10807
Add compressed file support for listing connectors by @lukekim in #10809
Improve Cayenne mutation, scan, and inline memtable scaling by @lukekim in #10811
Add range fallback for large join filters by @lukekim in #10816
Improve Cayenne join filter pushdown by @lukekim in #10818
Synchronize Cayenne partition commits across partitions by @phillipleblanc in #10819
fix: Deny nondistributed cayenne catalog by @peasee in #10821
Enable parallel Cayenne Vortex writes by @lukekim in #10822
Expand Arrow type handling in formatting and Elasticsearch by @lukekim in #10825
Add response.output_text.delta to responses API by @krinart in #10828
feat(cayenne): add join filter propagation and no-spill Q21 planning by @lukekim in #10840
Upgrade Turso to v0.6.0 by @sgrebnov in #10843
feat(cli): add spice feedback command to open community Slack by @lukekim in #10856
Upgrade iceberg to v0.9.1 by @sgrebnov in #10859
feat(cluster): per-request executor readiness gate on /v1/ready by @phillipleblanc in #10860
fix: Require dim-side statistics for CayennePropagateFilterAcrossEquiJoinKeys by @sgrebnov in #10863
fix: Debezium schema evolution breaks dataset init on reload by @claudespice in #10144
fix(mssql): Push topK limit to SQL Server for non-nullable sort columns by @Jeadie in #10621
fix(ScyllaDB): disable physical filter pushdown by @sgrebnov in #10772
fix: handle typed NULLs and prevent overflow in DynamoDB DML type conversions by @krinart in #10511
fix: use InsertOp::Overwrite in DynamoDB bootstrap scan_and_overwrite_accelerator by @krinart in #10639
Improve DynamoDB Bootstrap performance by @krinart in #10616
fix: preserve field and schema metadata in Vortex type transformation by @lukekim in #10628
fix: GH connector - explicitly use AWS LC RS crypto provider for jwt by @phillipleblanc in #10619
fix: add snapshot mode guards to delete_from/update and delegate DML in SwappableTableProvider by @phillipleblanc in #10685
Persist HTTP rate-control state in object storage by @lukekim in #10697
Rate limit metrics HTTP endpoint by @lukekim in #10162
feat(geo): add optional spatial SQL UDF support by @lukekim in #10833
feat(cayenne): CDC throughput, compaction, scan caching, and benchmarks by @lukekim in #10852
fix(cayenne): fix Vortex panic on highly compressible data by @sgrebnov in #10855
fix(cayenne): Read live protected snapshots after cleanup grace period by @sgrebnov in #10901
fix: Disable Cayenne HashJoin rewriter optimizer by @sgrebnov in #10882
Fix GetFlightInfo vs DoGet Flight Schema by @krinart in #10864
fix(search): preserve column casing in /v1/search primary key plumbing by @claudespice in #10909
fix(object-store): dedupe s3 url style auto-detection log by @phillipleblanc in #10898
Improve Spice CLI manifest editing and direct command modes by @lukekim in #10815
Persist Kafka CDC offsets in sidecar tables by @lukekim in #10823
feat(task-history): record Ballista stages for distributed queries by @phillipleblanc in #10831
Add '#[deny(clippy::missing_trait_methods)]' to wrapper/delegation trait impls by @Jeadie in #10795
Optimize Cayenne catalog maintenance paths by @lukekim in #10904
Centralize DuckDB settings for accelerator by @ewgenius in #10895
deps(ballista): bump to 47e2b494 to fix S3 shuffle reads under cluster mode by @phillipleblanc in #10910
Authorization header + Bump async-openai + responses_adapter fix by @krinart in #10911
Tune accelerators by storage profile by @lukekim in #10913
feat: add dataset-level on_schema_change config by @lukekim in #10908
Handle NULL sentinel for nullable partition expressions by @Jeadie in #10880
fix: Remove Cayenne Catalog from catalog registration by @peasee in #10914
Add catalog name to foreign key metadata in postgres catalog by @Jeadie in #10917
Cayenne perf: eliminate redundant clones, PK point-lookup fanout fix, IN-list rewrite + microbench coverage by @lukekim in #10916
fix(turso-shared): retry on Turso BEGIN CONCURRENT "Write-write conflict" by @lukekim in #10946
Vendor Vortex DataFusion for Cayenne by @lukekim in #10933
perf(cayenne): background retention + enable CDC pipelining for retention-configured tables by @lukekim in #10936
feat(cayenne): scale metastore pool to 32 + vs_duckdb_scaling benches (1→128 concurrency, sqlite + turso lanes) by @lukekim in #10943
feat(mcp): support auth for streamable HTTP tools by @phillipleblanc in #10927
Explicit error if v1/search requests a table without search index by @Jeadie in #10968
Fix spicepod loading failure when directory name contains dots by @sgrebnov in #10958
Extend append tests with arrow engine configurations by @sgrebnov in #10959
Remove dataset on_schema_change Policy from rc.5 release notes by @sgrebnov in #10964
Skip tpcds_q78 for Cayenne engine at SF100 by @sgrebnov in #10966
fix: Update benchmark snapshots May-20 by @app/github-actions in #10952
Fix #10951: UdtfExec invariant Vec lengths must match children count by @phillipleblanc in #10953
docs(release): update v2.0.0-rc.5 notes with latest trunk PRs by @lukekim in #10949
Remove eval related things for v2.0.0 by @Jeadie in #10945
build(deps): bump ubuntu from 24.04 to 26.04 in the docker-dependencies group by @app/dependabot in #10883
fix: Add publish = false to chbench-driver by @sgrebnov in #10939

Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0-rc.4...v2.0.0-rc.5

Spice v2.0-rc.3 (Apr 21, 2026)

April 21, 2026 · 13 min read

Evgenii Khramkov

Senior Software Engineer at Spice AI

Announcing the release of Spice v2.0-rc.3! ⚡

v2.0.0-rc.3 is the third release candidate for advanced testing of v2.0, building on v2.0.0-rc.2.

Highlights in this release candidate include:

HTTP Connector Enhancements with OAuth2 refresh-token authentication, query-parameter pagination, and map-to-array conversion for broader API compatibility
Databricks and Unity Catalog Reliability Improvements with resilience controls, improved UC-awareness, permission checks, and structured error reporting
Snowflake and ADBC Registration Performance Improvements with better observability during dataset registration
OpenTelemetry Exporter Improvements with exporter fixes and support for authenticated metrics export
Kafka, GitHub, and HTTP Connector Fixes including Kafka reliability improvements, GitHub GraphQL resilience updates, and HTTP JSON union/reload fixes

What's New in v2.0.0-rc.3

HTTP Connector Enhancements

The HTTP connector now supports more authentication and API response patterns, making it easier to integrate with modern REST APIs.

Key improvements:

OAuth2 Refresh-Token Authentication: Added support for OAuth2 refresh-token flows for APIs that issue short-lived access tokens.
Query-Parameter Pagination: Added pagination support using query parameters for APIs that expose page or cursor controls in the URL.
Map-to-Array Conversion: Added response transformation support for APIs that return map-shaped payloads that need to be normalized into arrays.
Improved JSON Union Handling: Better handling for heterogeneous JSON payloads during ingestion.
More Reliable Reloads: Fixed runtime behavior for HTTP-backed datasets during spicepod reloads.

Example configuration of an HTTP connector using the OAuth2 refresh token flow:

datasets:
  - from: https://api.example.com
    name: secure_data
    params:
      file_format: json
      allowed_request_paths: '/v1/**'
      auth_token_url: https://auth.example.com/oauth/token
      http_auth_refresh_token: ${secrets:my_refresh_token}
      http_auth_client_id: ${secrets:my_client_id}
      http_auth_client_secret: ${secrets:my_client_secret}
      auth_scopes: 'read:data offline_access'
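
For context, a refresh-token grant as defined in RFC 6749 (section 6) is a form-encoded POST to the token endpoint. The sketch below only builds the request body and sends nothing; `refresh_request_body` is a hypothetical helper, and passing client credentials in the body rather than via HTTP Basic authentication is an assumption, not a statement about how the Spice connector authenticates.

```python
from urllib.parse import parse_qs, urlencode


def refresh_request_body(refresh_token, client_id, client_secret, scope=None) -> str:
    """Build the form-encoded body for an OAuth2 refresh-token grant."""
    fields = {
        "grant_type": "refresh_token",  # fixed value required by RFC 6749
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scope:
        fields["scope"] = scope  # optional, space-delimited list of scopes
    return urlencode(fields)
```

The token endpoint responds with a new short-lived access token, which is what lets a long-running dataset keep querying an API after the original token expires.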

Databricks and Unity Catalog Reliability Improvements

Databricks and Unity Catalog integrations are now more resilient and provide clearer behavior in permission-constrained environments.

Key improvements:

Resilience Controls: Added controls to improve reliability when interacting with Databricks services.
Unity Catalog Awareness: Improved handling for Unity Catalog-specific behaviors and mixed deployment configurations.
Permission Prechecks: Databricks UC permission checks now distinguish explicit denials from ambiguous cases.
Structured Error Reporting: Advisory permission failures now surface with more actionable structured errors.
Classic SQL Warehouse Compatibility: Improved handling of foreign tables when used with Classic SQL Warehouses.
Task History Instrumentation: Added instrumentation to improve observability for Databricks-related operations.

Snowflake and ADBC Improvements

Snowflake and ADBC-backed dataset registration is now faster and easier to observe.

Key improvements:

Faster Dataset Registration: Improved registration performance for Snowflake and ADBC datasets.
Better Observability: Added better instrumentation and visibility into registration workflows.
ADBC Alignment: Updated ADBC dependencies and integration points for improved compatibility.
Search Schema Fix: Fixed a full-text search schema mismatch issue with the ADBC connector.

OpenTelemetry and Observability Improvements

This release improves the reliability of telemetry export and adds authenticated metrics delivery for the OpenTelemetry integration.

Key improvements:

OTEL Exporter Fixes: Fixed issues in the OpenTelemetry exporter.
Authenticated Metrics Export: Added support for authorization headers in the OTEL metrics exporter.
Reduced Startup Noise: Suppressed unnecessary AWS SDK noise and improved OTEL-related initialization behavior.
Connector Initialization Reliability: Fixed issues that could block connector initialization in telemetry-related code paths.

Dependency and Toolchain Updates

Dependency / Component	Version / Update
Rust toolchain	v1.94.1 (from v1.93.1)
DataFusion	v52.5.0-rc1
mistral.rs / candle	`mistral.rs` v0.8.x, `candle` v0.10.1
ADBC Core	v0.23

Other Improvements

Improved Query Pushdown: Expanded sort and limit pushdown, including improved pushdown behavior for Oracle and MSSQL connectors.
Partitioned Query Planning Improvements: Improved PartitionedTableScanRewrite handling for ORDER BY, partition expressions, and fully qualified table references, while preventing incorrect bucketing partition pushdown to executors.
MongoDB SRV Support: Upgraded datafusion-table-providers with MongoDB SRV support.
Tantivy Logging: Search logging now defaults to warn unless very verbose logging is enabled.
Kafka Connector Fixes: Improved reliability of the Kafka connector.
GitHub Connector Resilience: Improved commit fetching for dynamic and slash refs, and reduced GraphQL page sizes on gateway errors for the GitHub connector.
GitHub API Efficiency: Lowered default comment fetch sizes to reduce pressure on GitHub GraphQL APIs.
Embedding Validation: Added validation for embedding row_id columns during dataset initialization.
View Cache Invalidation: Cached plans are now cleared when views are updated.
Refresh SQL Dedup Fix: Fixed append refresh deduplication when refresh_sql selects a subset of columns.
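
The GitHub connector item above mentions reducing GraphQL page sizes on gateway errors. A minimal sketch of that general idea follows. It is not the connector's code: `GatewayError`, `fetch_with_shrinking_pages`, and `flaky_fetch` are hypothetical names, and halving the page size is an assumption chosen for illustration.

```python
class GatewayError(Exception):
    """Hypothetical error standing in for an HTTP 502/504 from the API."""


def fetch_with_shrinking_pages(fetch_page, page_size: int, min_size: int = 1):
    """Retry a page fetch, halving the page size after each gateway error."""
    while True:
        try:
            return page_size, fetch_page(page_size)
        except GatewayError:
            if page_size <= min_size:
                raise  # cannot shrink further, surface the error
            page_size = max(min_size, page_size // 2)


def flaky_fetch(page_size):
    # Pretend the gateway times out for pages larger than 25 items.
    if page_size > 25:
        raise GatewayError(page_size)
    return [f"commit-{i}" for i in range(page_size)]


size, items = fetch_with_shrinking_pages(flaky_fetch, page_size=100)
print(size, len(items))
```

Smaller pages cost more round trips but keep each request within what the gateway can serve.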

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.0.0-rc.3, use one of the following methods:

CLI:

spice upgrade v2.0.0-rc.3

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0-rc.3 image:

docker pull spiceai/spiceai:2.0.0-rc.3

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0-rc.3

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

fix: Full Text Search schema mismatch with ADBC connector by @lukekim in #10235
docs: Update v2.0.0-rc.2 release notes with latest changes by @lukekim in #10238
Fix append refresh dedup failure when refresh_sql selects column subset by @sgrebnov in #10225
Revert "Properly mark dataset as Ready on Scheduler (#10215)" by @sgrebnov in #10242
Fix failing merge conflicts for benchmarks by @krinart in #10247
fix(github): fetch commits for dynamic and slash refs by @lukekim in #10233
Upgrade DataFusion to v52.5.0-rc1 by @lukekim in #10249
Merge develop to trunk (2026-04-09) by @claudespice in #10248
fix: Validate embedding row_id columns during dataset init (fixes #8226) by @claudespice in #10208
fix: Update tpch benchmark snapshots for federated/glue[csv].yaml by @app/github-actions in #10244
feat(databricks): add resilience controls, UC awareness, and task history instrumentation by @lukekim in #10246
fix: Make PartitionManager resilient to bare vs fully qualified table references by @sgrebnov in #10257
fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-cayenne[file].yaml by @app/github-actions in #10256
Merge develop to trunk (2026-04-10) by @claudespice in #10251
Improve Snowflake/ADBC dataset registration performance and observability by @lukekim in #10266
Fixes for kafka connector by @krinart in #10263
fix(runtime): gate otel code tags, suppress aws sdk noise, and unblock connector init by @lukekim in #10260
fix(runtime): avoid regionless AWS SDK loads by @lukekim in #10271
Add versioned release install workflow coverage by @lukekim in #10276
fix(runtime): handle HTTP JSON unions and spicepod reloads by @lukekim in #10277
Databricks UC permission prechecks: explicit denial as permanent error, ambiguous cases advisory by @lukekim in #10274
Revert component status changes re-introduced by develop merge (#10248) by @sgrebnov in #10293
Fix broken CI workflows by @ewgenius in #10294
Group dependabot updates by ecosystem by @lukekim in #10296
fix(tests): Replace flaky S3 Vectors snapshot tests with structural validation by @lukekim in #10301
Update test_github_workflows snapshot by @lukekim in #10304
fix(ci): fix Bedrock runner mismatch and snapshot auto-merge failure by @ewgenius in #10306
feat(http): Add map-to-array conversion and query-parameter pagination by @lukekim in #10295
New crate: datafusion-ddl by @Jeadie in #10205
Make Databricks UC permission checks advisory with structured error reporting by @lukekim in #10283
build(deps): bump the github-actions-dependencies group with 4 updates by @app/dependabot in #10298
fix: Clear cached plans on view updates by @peasee in #10312
build(deps): bump the aws-sdk group with 7 updates by @app/dependabot in #10299
Code out of runtime. by @Jeadie in #10178
fix: Respect function registry denies for accelerated table filter pushdown by @peasee in #10311
fix: Don't block heartbeat when all slots acquired by @peasee in #10322
fix: strip only outer parens in get_table_partition_expr_from_ctx by @Jeadie in #10323
Upgrade datafusion-table-providers with MongoDB SRV support by @lukekim in #10317
fix: Avoid pushing down bucketing partition expressions into executors by @peasee in #10324
Upgrade datafusion-table-providers to d1b911a5 and bump adbc to 0.23 by @lukekim in #10329
fix: Update Search integration test snapshots by @app/github-actions in #10308
Handle foreign table + Classic sql warehouse combination gracefully by @krinart in #10318
New crate datafusion-flightsql by @Jeadie in #10201
Set tantivy=warn unless very verbose logging by @Jeadie in #10338
Remove image registry and image name options from spidapter by @ewgenius in #10241
build(deps): bump sysinfo from 0.37.2 to 0.38.4 by @app/dependabot in #10291
build(deps): bump futures from 0.3.31 to 0.3.32 by @app/dependabot in #10289
New crate 'datafusion-dml' by @Jeadie in #10334
Jeadie/26 04 16/spice sql by @Jeadie in #10343
Add Teraswitch/Pittsburgh apt mirrors + retry config for CI runners by @lukekim in #10349
Implement sort pushdown and fix pushdown gaps across providers by @lukekim in #10337
Merge develop to trunk (2026-04-16) by @claudespice in #10345
Update candle and mistral.rs lock-step pins by @lukekim in #10278
docs: fix status badges in README by @lukekim in #10350
Migrate secrets to vars by @krinart in #10354
Add limit pushdown and improve sort pushdown for Oracle and MSSQL by @sgrebnov in #10351
Fix ubuntu mirror configuration by @ewgenius in #10359
fix: Increase throughput test default ready_wait from 30s to 300s (fixes #8207) by @claudespice in #10344
Add auth headers support to OTEL metrics exporter by @lukekim in #10347
fix(github): shrink GraphQL page size on gateway errors; lower comment defaults by @lukekim in #10355
Relax apt mirror substitution failure to warning in CI action by @ewgenius in #10361
feat(http): Add OAuth2 refresh-token auth to HTTP connector by @lukekim in #10348
Upgrade Rust toolchain to 1.94.1 by @lukekim in #10353
Handle order by and sort in PartitionedTableScanRewrite by @Jeadie in #9656
Fix OTEL Exporter by @krinart in #10363
Pin spiceai candle / TEI forks to merged revs; drop local [patch] overrides by @lukekim in #10362

Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0-rc.2...v2.0.0-rc.3

Spice v1.10.4 (Jan 5, 2026)

January 5, 2026 · 2 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

Announcing the release of Spice v1.10.4! 🛠️

v1.10.4 is a patch release with fixes for Kafka/Debezium batch commits, ABFSS URL support for Azure Data Lake Storage Gen2, and improved column projection handling for location metadata columns.

What's New in v1.10.4

Additional Improvements & Bug Fixes

Reliability: Fixed Kafka and Debezium batch commit handling to properly commit offsets across all partitions. Previously, only the last message's offset was committed, which could cause message loss when batches contained messages from multiple partitions.
Reliability: Added support for abfss:// URL prefix for Azure Data Lake Storage Gen2, in addition to the existing abfs:// prefix. The abfss scheme indicates secure (TLS) connections to ADLS Gen2.
Reliability: Fixed column projection order mismatch when querying datasets with location metadata columns (e.g., SELECT location, day, size FROM dataset). Queries that specified columns in a different order than the schema would fail with "column types must match schema types" errors.
Developer Experience: Added detailed diagnostic logging for union projection pushdown optimization failures in cluster mode. When projection pushdown cannot be applied, debug-level logs now provide additional context to help identify the root cause.
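
The batch commit fix in the first item can be illustrated with a short sketch. It is not the Spice implementation; it only shows why a batch that spans partitions needs one committed offset per partition. `Message` and `offsets_to_commit` are hypothetical names, and the sketch follows the Kafka convention that the committed offset is the offset of the next message to read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for a consumed Kafka record."""
    partition: int
    offset: int


def offsets_to_commit(batch: list[Message]) -> dict[int, int]:
    """Return, per partition, the next offset to read after this batch."""
    commits: dict[int, int] = {}
    for msg in batch:
        next_offset = msg.offset + 1
        if next_offset > commits.get(msg.partition, 0):
            commits[msg.partition] = next_offset
    return commits


batch = [Message(0, 41), Message(1, 7), Message(0, 42), Message(2, 99), Message(1, 8)]
print(offsets_to_commit(batch))
```

Committing only the last message in this batch (partition 1, offset 8) would record no progress for partitions 0 and 2.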

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 84 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.10.4, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.4 image:

docker pull spiceai/spiceai:1.10.4

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Update acknowledgements by @app/github-actions in #8695
Proper batch commit for kafka/debezium by @krinart in #8671
Add support for abfss by @krinart in #8706
cluster: UnionProjectionPushdownOptimizer: Add projection pushdown diagnostics for union children by @phillipleblanc in #8734
Fix column projection order mismatch with location metadata columns by @phillipleblanc in #8738

Spice v1.10.2 (Dec 22, 2025)

December 23, 2025 · 5 min read

Sergei Grebnov

Senior Software Engineer at Spice AI

Announcing the release of Spice v1.10.2! 🔥

v1.10.2 introduces Tiered Caching Acceleration with Localpod for multi-layer acceleration architectures, Periodic Acceleration Snapshots with configurable intervals, DynamoDB JSON Nesting for column consolidation, and Kafka/Debezium Batching for faster data ingestion. This release also includes fixes for SQLite accelerator decimal/date handling and real-time status reporting for the /v1/datasets and /v1/models API endpoints.

What's New in v1.10.2

Tiered Caching with Localpod

Multi-Layer Acceleration Architecture: The Localpod connector now supports caching refresh mode, enabling tiered acceleration where a persistent cache (e.g., file-mode DuckDB) feeds a fast in-memory cache (e.g., Arrow, memory-mode DuckDB).

Key Features:

Automatic Cache Propagation: New cache entries automatically propagate from parent to child accelerators
Warm Startup: Child accelerators initialize from existing parent data on startup, eliminating cold-start latency
Flexible Tiering: Combine any accelerator engines (DuckDB, SQLite, Cayenne) across tiers

Example spicepod.yaml configuration:

datasets:
  # Parent: persistent file-mode cache
  - from: https://api.example.com
    name: api_cache
    acceleration:
      enabled: true
      refresh_mode: caching
      engine: duckdb
      mode: file

  # Child: fast in-memory cache fed by parent
  - from: localpod:api_cache
    name: api_cache_memory
    acceleration:
      enabled: true
      refresh_mode: caching
      engine: arrow
      mode: memory

For more details, refer to the Localpod Data Connector Documentation.
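
As a rough mental model of the tiering described above, the sketch below uses two plain dictionaries in place of the parent and child accelerators. It is an illustration only, not how Spice implements Localpod; `TieredCache` and `fetch` are hypothetical names.

```python
class TieredCache:
    """Two-tier cache: a persistent parent feeding a fast in-memory child."""

    def __init__(self, parent: dict, fetch):
        self.parent = parent        # stands in for a file-mode accelerator
        self.child = dict(parent)   # warm startup: child starts from parent data
        self.fetch = fetch          # hypothetical loader for the upstream source

    def get(self, key):
        if key in self.child:
            return self.child[key]
        if key not in self.parent:
            self.parent[key] = self.fetch(key)  # new entry lands in the parent
        self.child[key] = self.parent[key]      # and propagates to the child
        return self.child[key]


calls = []


def fetch(key):
    calls.append(key)
    return f"response for {key}"


cache = TieredCache(parent={"/a": "response for /a"}, fetch=fetch)
cache.get("/a")  # served from the warm child tier
cache.get("/b")  # fetched once, stored in both tiers
cache.get("/b")  # served from the child tier
print(calls)
```

Only the first request for `/b` reaches the upstream source; every other read is answered from a cache tier.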

Periodic Acceleration Snapshots

Configurable Snapshot Intervals: A new snapshots_create_interval parameter enables periodic snapshot creation for accelerated datasets across all refresh modes. This provides better control over snapshot frequency and ensures consistent recovery points for accelerated data.

Example spicepod.yaml configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_data
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: caching
      snapshots: enabled
      params:
        snapshots_create_interval: 60s # Write a snapshot every 60 seconds

For more details, refer to the Data Acceleration Documentation.

DynamoDB JSON Nesting

Consolidate Columns into JSON: The DynamoDB Data Connector now supports consolidating columns into a single JSON column using the json_object: "*" metadata option. This is useful when only a few columns are needed as discrete fields while the rest can be accessed as nested JSON.

Example spicepod.yaml configuration:

datasets:
  - from: dynamodb:my_table
    name: my_table
    columns:
      - name: PK
      - name: SK
      - name: data_json
        metadata:
          json_object: '*' # Captures all other columns as JSON

Example Output: Given a DynamoDB table with columns PK, SK, name, email, and status, the resulting table schema consolidates all non-specified columns into the data_json column:

PK	SK	data_json
pk_1	sort_1	`{"name": "Alice", "email": "alice@example.com", "status": "active"}`
pk_2	sort_2	`{"name": "Bob", "email": "bob@example.com", "status": "inactive"}`

For more details, refer to the DynamoDB JSON Nesting Documentation.
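
The consolidation shown in the example output can be sketched in a few lines of Python. This is an illustration of the transformation only, not the connector's code; `nest_remaining` is a hypothetical helper.

```python
import json


def nest_remaining(item: dict, keep: list[str], json_column: str) -> dict:
    """Keep the named columns and fold every other attribute into one JSON string."""
    row = {name: item.get(name) for name in keep}
    rest = {k: v for k, v in item.items() if k not in keep}
    row[json_column] = json.dumps(rest)
    return row


item = {"PK": "pk_1", "SK": "sort_1", "name": "Alice",
        "email": "alice@example.com", "status": "active"}
row = nest_remaining(item, keep=["PK", "SK"], json_column="data_json")
print(row["data_json"])
```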

Kafka/Debezium Batching

Faster Data Ingestion: Configure message batching for Kafka and Debezium connectors to improve data ingestion throughput. Batching reduces processing overhead by grouping multiple messages together before insertion.

Key Features:

Configurable Batch Size: Control the maximum number of records per batch (default: 10,000)
Configurable Batch Duration: Set the maximum wait time before flushing a partial batch (default: 1s)

Example spicepod.yaml configuration:

datasets:
  - from: debezium:kafka-server.public.my_table
    name: my_table
    params:
      batch_max_size: 10000 # Max records per batch (default: 10000)
      batch_max_duration: 1s # Max wait time per batch (default: 1s)

For more details, refer to the Kafka Data Connector Documentation and Debezium Data Connector Documentation.
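
The interaction between the two limits can be sketched as follows. This is a simplified illustration, not the connector's implementation: `Batcher` is a hypothetical class, and timestamps are passed in explicitly so the behavior is deterministic (a real consumer would read a monotonic clock).

```python
class Batcher:
    """Collects records and flushes on size or age, whichever comes first."""

    def __init__(self, max_size: int, max_duration: float):
        self.max_size = max_size
        self.max_duration = max_duration  # seconds
        self.records = []
        self.started_at = None  # arrival time of the first record in the batch

    def add(self, record, now: float):
        """Add a record; return a full batch if the size limit was reached."""
        if not self.records:
            self.started_at = now
        self.records.append(record)
        if len(self.records) >= self.max_size:
            return self._flush()
        return None

    def poll(self, now: float):
        """Return a partial batch once it has waited max_duration seconds."""
        if self.records and now - self.started_at >= self.max_duration:
            return self._flush()
        return None

    def _flush(self):
        batch, self.records, self.started_at = self.records, [], None
        return batch


b = Batcher(max_size=3, max_duration=1.0)
b.add("r1", now=0.0)
b.add("r2", now=0.1)
print(b.add("r3", now=0.2))  # size limit reached: the full batch is returned
b.add("r4", now=0.5)
print(b.poll(now=1.0))       # batch is 0.5s old: nothing is flushed yet
print(b.poll(now=1.5))       # batch is 1.0s old: the partial batch is flushed
```

A larger batch size amortizes per-insert overhead, while the duration limit bounds how long a record can wait on a quiet topic.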

Additional Improvements & Bug Fixes

Reliability: Fixed SQLite accelerator decimal and date type handling for improved data type accuracy.
Reliability: Fixed real-time status reporting for /v1/datasets and /v1/models API endpoints.
Reliability: Fixed Kafka warning when security.protocol is set to PLAINTEXT.

Breaking Changes

No breaking changes.

Cookbook Updates

New Cayenne Data Accelerator Recipe: New recipe demonstrating how to accelerate a local copy of the taxi trips dataset using Cayenne as the data accelerator engine. See Cayenne Data Accelerator Recipe for details.

New Dataset Partitioning Recipe: New recipe demonstrating how to partition accelerated datasets to improve query performance. See Dataset Partitioning for details.

The Spice Cookbook includes 84 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.10.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.2 image:

docker pull spiceai/spiceai:1.10.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Fix kafka warning when security.protocol is set to PLAINTEXT by @krinart in #8587
fix: SQLite accelerator decimal/date handling by @phillipleblanc in #8606
feat: Enable localpod with caching mode accelerator for tiered caching by @phillipleblanc in #8621
Remove the clippy::too_many_lines lint by @phillipleblanc in #8549
Add snapshot interval for acceleration snapshots by @phillipleblanc in #8627
Json Nesting for DynamoDB by @krinart in #8623
Implement batching for Kafka/Debezium + null Decimal handling by @krinart in #8622
fix: Status field in /v1/datasets & /v1/models by @lukekim in #8633

Spice v1.6.1 (Sep 1, 2025)

September 2, 2025 · 3 min read

Jack Eadie

Token Plumber at Spice AI

Announcing the release of Spice v1.6.1! ⚡

Spice 1.6.1 is a patch release that provides improved Kafka type inference and JSON flattening support, alongside several bug fixes.

What's New in v1.6.1

Improved Kafka Type Inference: The number of Kafka messages sampled during schema inference is now configurable. Increasing the sample size can make inferred schemas more robust, especially when messages contain optional fields or varying structures.

Example spicepod.yml:

datasets:
  - from: kafka:orders_events
    name: orders
    params:
      schema_infer_max_records: 100 # Default 1.

For details, see the Kafka Data Connector Documentation.
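
To see why the sample size matters, consider this small sketch. It is a toy model of sampling-based inference, not the connector's algorithm; `infer_schema` is a hypothetical helper that records Python type names rather than Arrow types.

```python
def infer_schema(messages: list[dict], max_records: int) -> dict[str, str]:
    """Union the fields and type names seen in the first max_records messages."""
    schema: dict[str, str] = {}
    for message in messages[:max_records]:
        for field, value in message.items():
            if value is not None:
                schema.setdefault(field, type(value).__name__)
    return schema


messages = [
    {"order_id": "a1", "total": 10.5},
    {"order_id": "a2", "total": 3.0, "coupon": "SPRING"},
]
print(infer_schema(messages, max_records=1))
print(infer_schema(messages, max_records=2))
```

With a single sampled message the optional `coupon` field is missed; sampling both messages picks it up.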

Improved Kafka JSON Support: Nested JSON Kafka messages can now be flattened so that each nested field becomes its own column in the dataset schema.

Example spicepod.yml:

datasets:
  - from: kafka:orders_events
    name: orders
    params:
      flatten_json: true # default false

For example, the object:

{
  "order_id": "a1f2c3d4-1111-2222-3333-444455556666",
  "customer": {
    "id": 101,
    "name": "Alice",
    "premium": true,
    "contact": {
      "email": "alice@example.com",
      "phone": "555-1234"
    }
  },
  "discount": 5.0,
  "shipped": false
}

With flatten_json: true the result is:

+------------------------+-----------+-------------+
| column_name            | data_type | is_nullable |
+------------------------+-----------+-------------+
| order_id               | Utf8      | YES         |
| customer.id            | Int64     | YES         |
| customer.name          | Utf8      | YES         |
| customer.premium       | Boolean   | YES         |
| customer.contact.email | Utf8      | YES         |
| customer.contact.phone | Utf8      | YES         |
| discount               | Float64   | YES         |
| shipped                | Boolean   | YES         |
+------------------------+-----------+-------------+

With flatten_json: false or omitted the result is:

+-------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
| column_name | data_type                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | is_nullable |
+-------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
| order_id    | Utf8                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | YES         |
| customer    | Struct([Field { name: "id", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "name", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "premium", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "contact", data_type: Struct([Field { name: "email", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "phone", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) | YES         |
| discount    | Float64                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | YES         |
| shipped     | Boolean                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | YES         |
+-------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+

For details, see the Kafka Data Connector Documentation.
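
The column naming produced by flattening can be reproduced with a short recursive function. This is a sketch of the dot-separated naming only, not the connector's code; `flatten` is a hypothetical helper.

```python
def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into a single level with dot-separated keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


order = {
    "order_id": "a1f2c3d4-1111-2222-3333-444455556666",
    "customer": {"id": 101, "name": "Alice", "premium": True,
                 "contact": {"email": "alice@example.com", "phone": "555-1234"}},
    "discount": 5.0,
    "shipped": False,
}
print(list(flatten(order)))
```

The resulting keys match the column names in the flattened schema table above.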

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes added in this release.

The Spice Cookbook includes 77 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.6.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.6.1 image:

docker pull spiceai/spiceai:1.6.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Fix metadata field issue by @Advayp in #6957
Update datafusion and datafusion-table-providers crates (#6985) by @Jeadie in #6985
Add flatten_json param support for Kafka connector (#6976) by @sgrebnov in #6976
Add schema_inference_sample_count param support for Kafka connector (#6969) by @sgrebnov in #6969
Add integration test for Kafka connector (#6965) by @sgrebnov in #6965
Skip dataset health check for IcebergTableProvider datasets by @phillipleblanc in #6995

Spice v1.6.0 (Aug 26, 2025)

August 27, 2025 · 22 min read

Sergei Grebnov

Senior Software Engineer at Spice AI

Announcing the release of Spice v1.6.0! 🔥

Spice 1.6.0 upgrades DataFusion to v48, reducing expressions memory footprint by ~50% for faster planning and lower memory usage, eliminating unnecessary projections in queries, optimizing string functions like ascii and character_length for up to 3x speedup, and accelerating unbounded aggregate window functions by 5.6x. The release adds Kafka and MongoDB connectors for real-time streaming and NoSQL data acceleration, supports OpenAI Responses API for advanced model interactions including OpenAI-hosted tools like web_search and code_interpreter, improves the OpenAI Embeddings Connector with usage tier configuration for higher throughput via increased concurrent requests, introduces Model2Vec embeddings for ultra-low-latency encoding, and improves the Amazon S3 Vectors engine to support multi-column primary keys.

What's New in v1.6.0

DataFusion v48 Highlights

Spice.ai is built on the DataFusion query engine. The v48 release brings:

Performance & Size Improvements 🚀: The memory footprint of expressions was reduced by ~50%, resulting in faster planning and lower memory usage, with planning times improved by 10-20%. Queries now contain fewer unnecessary projections. The string functions ascii and character_length were optimized, with character_length achieving up to a 3x speedup. Queries with unbounded aggregate window functions run 5.6 times faster by avoiding unnecessary computation for results that are constant across partitions. The Expr struct size was reduced from 272 to 144 bytes.

New Features & Enhancements ✨: Support was added for ORDER BY ALL for easy ordering of all columns in a query.

See the Apache DataFusion 48.0.0 Blog for details.

Runtime Highlights

Amazon S3 Vectors Multi-Column Primary Keys: The Amazon S3 Vectors engine now supports datasets with multi-column primary keys. This enables vector indexes for datasets where more than one column forms the primary key, such as those splitting documents into chunks for retrieval contexts. For multi-column keys, Spice serializes the keys using arrow-json format, storing them as single string keys in the vector index.
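
The idea of collapsing several key columns into one string key can be sketched as below. The exact arrow-json layout Spice writes is not described here, so this sketch assumes a plain JSON object encoded with Python's `json` module; `encode_key` and `decode_key` are hypothetical helpers.

```python
import json


def encode_key(row: dict, key_columns: list[str]) -> str:
    """Serialize the primary key columns of a row into one string key."""
    return json.dumps({col: row[col] for col in key_columns}, separators=(",", ":"))


def decode_key(key: str) -> dict:
    """Recover the primary key columns from a serialized key."""
    return json.loads(key)


chunk = {"doc_id": "report-7", "chunk_no": 3, "text": "..."}
key = encode_key(chunk, ["doc_id", "chunk_no"])
print(key)
print(decode_key(key))
```

Because the encoding is reversible, a search hit in the vector index can be mapped back to the original multi-column key.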

Model2Vec Embeddings: Spice now supports model2vec static embeddings through a new model2vec embeddings provider. Static embedding models are up to 500x faster and 15x smaller than sentence transformers, enabling scenarios that require low-latency, high-throughput encoding.

embeddings:
  - from: model2vec:minishlab/potion-base-8M # HuggingFace model
    name: potion
  - from: model2vec:path/to/my/local/model # local model
    name: local

Learn more in the Model2Vec Embeddings documentation.

Kafka Data Connector: Use from: kafka:<topic> to ingest data directly from Kafka topics for integration with existing Kafka-based event streaming infrastructure, providing real-time data acceleration and query without additional middleware.

Example spicepod.yml:

datasets:
  - from: kafka:orders_events
    name: orders
    acceleration:
      enabled: true
      refresh_mode: append
    params:
      kafka_bootstrap_servers: server:9092

Learn more in the Kafka Data Connector documentation.

MongoDB Data Connector: Use from: mongodb:<dataset> to access and accelerate data stored in MongoDB, deployed on-premises or in the cloud.

Example spicepod.yml:

datasets:
  - from: mongodb:my_dataset
    name: my_dataset
    params:
      mongodb_host: localhost
      mongodb_db: my_database
      mongodb_user: my_user
      mongodb_pass: password

Learn more in the MongoDB Data Connector documentation.

OpenAI Responses API Support: The OpenAI Responses API (/v1/responses), OpenAI's most advanced interface for generating model responses, is now supported.

To enable the /v1/responses HTTP endpoint, set the responses_api parameter to enabled:

Example spicepod.yml:

models:
  - name: openai_model_using_responses_api
    from: openai:gpt-4.1
    params:
      openai_api_key: ${ secrets:OPENAI_API_KEY }
      responses_api: enabled # Enable the /v1/responses endpoint for this model

Example curl request:

curl http://localhost:8090/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "input": "Tell me a three sentence bedtime story about Spice AI."
  }'

To use responses in spice chat, use the --responses flag.

Example:

spice chat --responses # Use the `/v1/responses` endpoint for all completions instead of `/v1/chat/completions`

Use OpenAI-hosted tools supported by OpenAI's Responses API by specifying the openai_responses_tools parameter:

Example spicepod.yml:

models:
  - name: test
    from: openai:gpt-4.1
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
      tools: sql, list_datasets
      responses_api: enabled
      openai_responses_tools: web_search, code_interpreter #  'code_interpreter' or 'web_search'

These OpenAI-specific tools are only available from the /v1/responses endpoint. Any other tools specified via the tools parameter are available from both the /v1/chat/completions and /v1/responses endpoints.

Learn more in the OpenAI Model Provider documentation.

OpenAI Embeddings & Models Connectors Usage Tier: The OpenAI Embeddings and Models Connectors now support specifying the account usage tier for embeddings and model requests. Raising the number of concurrent requests improves the performance of generating text embeddings or calling models during dataset load and search.

Example spicepod.yml:

embeddings:
  - from: openai:text-embedding-3-small
    name: openai_embed
    params:
      openai_usage_tier: tier1

When the usage tier is set to match the tier of your OpenAI account, the Embeddings and Models Connectors increase the maximum number of concurrent requests to match that tier.

Learn more in the OpenAI Model Provider documentation.

Contributors

New Contributors

@krinart made their first contribution in github.com/spiceai/spiceai/pull/6573

Breaking Changes

No breaking changes.

Cookbook Updates

Added OpenAI Responses API - Use OpenAI's Responses API with Spice
Added Live Orders Analytics with Apache Kafka Data Connector - Combine real-time data streaming from Kafka with other datasets
Added MongoDB Data Connector - Use MongoDB as a data source with Spice

The Spice Cookbook includes 77 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.6.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.6.0 image:

docker pull spiceai/spiceai:1.6.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is also now available in the AWS Marketplace!

What's Changed

Dependencies

DataFusion: Upgraded to v48
Rust: Upgraded from 1.86.0 to 1.87.0

Changelog

Support Streaming with Tool Calls (#6941) by @Advayp in #6941
Fix parameterized query planning in DataFusion (#6942) by @Jeadie in #6942
Update the UnableToLoadCredentials error with a pointer to docs (#6937) by @phillipleblanc in #6937
Fix spicecloud benchmark (#6935) by @krinart in #6935
[Debezium] Support for VariableScaleDecimal (#6934) by @krinart in #6934
Update to DF 48 (#6665) by @mach-kernel and @kczimm in #6665
Mark append-stream and CDC datasets as ready after first message (#6914) by @sgrebnov in #6914
Model2Vec embedding model support (#6846) by @mach-kernel in #6846
Update snapshot for S3 vector search test (#6920) by @Jeadie in #6920
remove [] from queryset in spicepod path for CI (#6919) by @Jeadie in #6919
Remove verbose tracing (#6915) by @Jeadie in #6915
Refactor how models supporting the Responses API are loaded (#6912) by @Advayp in #6912
Write tests for truncate formatting in arrow_tools and fix bug. (#6900) by @Jeadie in #6900
Support using the Responses API from spice chat (#6894) by @Advayp in #6894
Include GPT-5 into Text-To-SQL and Financebench benchmarks (#6907) by @sgrebnov in #6907
Better error message when credentials aren't loaded for S3 Vectors (#6910) by @phillipleblanc in #6910
Add tracing and system prompt support for the Responses API (#6893) by @Advayp in #6893
Constraint violation check is improved to control behavior when violations occur within a batch (#6897) by @phillipleblanc in #6897
fix: Multi-column text search with v1/search (#6905) by @peasee in #6905
fix: Correctly project text search primary keys to underlying projection (#6904) by @peasee in #6904
fix: Update benchmark snapshots (#6901) by @app/github-actions in #6901
In S3vector, do not pushdown on non-filterable columns (#6884) by @Jeadie in #6884
Run E2E Test CI macOS build on bigger runners (#6896) by @phillipleblanc in #6896
Enable configuration of the Responses API for the Azure model provider (#6891) by @Advayp in #6891
fix: Update benchmark snapshots (#6888) by @app/github-actions in #6888
Update OpenAPI specification for /v1/responses (#6889) by @Advayp in #6889
Add test to ensure tools are injected correctly in the Responses API (#6886) by @Advayp in #6886
Enable embeddings for append streams (#6878) by @sgrebnov in #6878
Show correct limit for EXPLAIN plans in S3VectorsQueryExec (#6852) by @Jeadie in #6852
Responses API support for Azure Open AI (#6879) by @Advayp in #6879
fix: Update search test case structure (#6865) by @peasee in #6865
Fix mongodb benchmark (#6883) by @phillipleblanc in #6883
Support multiple column primary keys for S3 vectors. (#6775) by @Jeadie in #6775
Kafka Data Connector: persist consumer between restarts (#6870) by @sgrebnov in #6870
Fix newlines in errors added in recent PRs (#6877) by @phillipleblanc in #6877
Add override parameter to force support for the Responses API (#6871) by @Advayp in #6871
Don't use metadata columns in VectorScanTableProvider (#6854) by @Jeadie in #6854
Add non-streaming tool call support (hosted and Spice tools) via the Responses API (#6869) by @Advayp in #6869
Update error guideline to remove newlines + remove newlines from error messages. (#6866) by @phillipleblanc in #6866
Remove void acceleration engine + optional table behaviors (#6868) by @phillipleblanc in #6868
Kafka Data Connector basic support (#6856) by @sgrebnov in #6856
Federated+Accelerated TPCH Benchmarks for MongoDB (#6788) by @krinart in #6788
Pass embeddings calculated in compute_index to the acceleration (#6792) by @phillipleblanc in #6792
Add non-streaming and streaming support for OpenAI Responses API endpoint (#6830) by @Advayp in #6830
Use latest version of OpenAI crate to resolve issues with Service Tier deserialization (#6853) by @Advayp in #6853
Update openapi.json (#6799) by @app/github-actions in #6799
Improve management message (#6850) by @lukekim in #6850
fix: Include FTS search column if it is the PK (#6836) by @peasee in #6836
Refactor Health Checks (#6848) by @Advayp in #6848
Introduce a Responses trait and LLM registry for model providers that support the OpenAI Responses API (#6798) by @Advayp in #6798
fix: Update datafusion-table-providers to include constraints (#6837) by @peasee in #6837
Bump postcard from 1.1.2 to 1.1.3 (#6841) by @app/dependabot in #6841
Bump governor from 0.10.0 to 0.10.1 (#6835) by @app/dependabot in #6835
Bump ctor from 0.2.9 to 0.5.0 (#6827) by @app/dependabot in #6827
Bump azure_core from 0.26.0 to 0.27.0 (#6826) by @app/dependabot in #6826
Bump rstest from 0.25.0 to 0.26.1 (#6825) by @app/dependabot in #6825
Use latest commit in our fork of async-openai (#6829) by @Advayp in #6829
Bump rustls from 0.23.27 to 0.23.31 (#6824) by @app/dependabot in #6824
Bump async-trait from 0.1.88 to 0.1.89 (#6823) by @app/dependabot in #6823
Bump hyper from 1.6.0 to 1.7.0 (#6814) by @app/dependabot in #6814
Bump serde_json from 1.0.140 to 1.0.142 (#6812) by @app/dependabot in #6812
Add s3 vector test retrieving vectors (#6786) by @Jeadie in #6786
fix: Allow v1/search with only FTS (#6811) by @peasee in #6811
Bump tantivy from 0.24.1 to 0.24.2 (#6806) by @app/dependabot in #6806
Bump tokio-util from 0.7.15 to 0.7.16 (#6810) by @app/dependabot in #6810
fix: Improve FTS index primary key handling (#6809) by @peasee in #6809
Bump logos from 0.15.0 to 0.15.1 (#6808) by @app/dependabot in #6808
Bump hf-hub from 0.4.2 to 0.4.3 (#6807) by @app/dependabot in #6807
Bump odbc-api from 13.0.1 to 13.1.0 (#6803) by @app/dependabot in #6803
fix: Spice search CLI with FTS supports string or slice unmarshalling (#6805) by @peasee in #6805
Bump uuid from 1.17.0 to 1.18.0 (#6797) by @app/dependabot in #6797
Bump reqwest from 0.12.22 to 0.12.23 (#6796) by @app/dependabot in #6796
Bump anyhow from 1.0.98 to 1.0.99 (#6795) by @app/dependabot in #6795
Bump clap from 4.5.41 to 4.5.45 (#6794) by @app/dependabot in #6794
Respect default MAX_DECODING_MESSAGE_SIZE (100MB) in Flight API (#6802) by @sgrebnov in #6802
Fix compilation errors caused by upgrading async-openai (#6793) by @Advayp in #6793
Remove outdated vector search benchmark (replaced with testoperator) (#6791) by @sgrebnov in #6791
Handle errors in vector ingestion pipeline (#6782) by @phillipleblanc in #6782
fix: Explicitly error when chunking is defined for vector engines (#6787) by @peasee in #6787
Make VectorScanTableProvider and VectorQueryTableProvider support multi-column primary keys (#6757) by @Jeadie in #6757
Use megascience/megascience Q+A dataset for text search testing. (#6702) by @Jeadie in #6702
Flight REPL autocomplete (#6589) by @krinart in #6589
use ref: github.event.pull_request.head.sha in integration_models.yml (#6780) by @Jeadie in #6780
fix: Move search telemetry calls in UDTF to scan (#6778) by @peasee in #6778
Fix Hugging Face models and embeddings loading in Docker (#6777) by @ewgenius in #6777
feat: Migrate bedrock rate limiter (#6773) by @peasee in #6773
Run the PR checks on the DEV runners (#6769) by @phillipleblanc in #6769
feat: add OpenAI models rate controller (#6767) by @peasee in #6767
Implement MongoDB data connector (#6594) by @krinart in #6594
fix: Use head ref for concurrency group (#6770) by @peasee in #6770
fix: Run enforce pulls with spice on pull_request_target (#6768) by @peasee in #6768
feat: Add OpenAI Embeddings Rate Controller (#6764) by @peasee in #6764
Move AWS SDK credential bridge integration test to the existing AWS SDK integration test run (#6766) by @phillipleblanc in #6766
Use Spice specific errors instead of OpenAIError in embedding module (#6748) by @kczimm in #6748
Use context in Glue Catalog Provider (#6763) by @Advayp in #6763
pin cargo-deny to previous version (#6762) by @kczimm in #6762
Bump actions/download-artifact from 4 to 5 (#6720) by @app/dependabot in #6720
Upgrade dependabot dependencies (#6754) by @phillipleblanc in #6754
Set E2E Test CI models build to 90 minute timeout (#6756) by @phillipleblanc in #6756
chore: upgrade to Rust 1.87.0 (#6614) by @kczimm in #6614
feat: Add initial runtime-rate-limiter crate (#6753) by @peasee in #6753
feat: Add more embedding traces, add MiniLM MTEB spicepod (#6742) by @peasee in #6742
Update QA analytics for release (#6740) by @Advayp in #6740
Always use 'returnData: true' for s3 vector query index (#6741) by @Jeadie in #6741
feat: Add Embedding and Search anonymous telemetry (#6737) by @peasee in #6737
Add 1.5.2 to SECURITY.md (#6739) by @ewgenius in #6739
Combine the Iceberg and Object Store AWS SDK bridges into one crate (#6718) by @Advayp in #6718
Updates to v1.5.2 release notes (#6736) by @lukekim in #6736
Update end game template - move glue catalog to catalogs section (#6732) by @ewgenius in #6732
Update v1.5.2.md (#6735) by @kczimm in #6735
Add note about S3 Vectors workaround (#6734) by @phillipleblanc in #6734
feat: Avoid joining for VectorScanTableProvider if the index is sufficient (#6714) by @peasee in #6714
update changelog (#6729) by @kczimm in #6729
remove unneeded autogenerated s3 vector code (#6715) by @Jeadie in #6715
fix: Set S3 vectors default limit to 30, add more tracing (#6712) by @peasee in #6712
docs: Add Hadoop cookbook to endgame template (#6708) by @peasee in #6708
Fix testoperator append mode compilation error (#6706) by @phillipleblanc in #6706
test: Add VectorScanTableProvider snapshot tests (#6701) by @peasee in #6701
feat: Add Hadoop catalog-mode benchmark (#6684) by @peasee in #6684
Move shared AWS crates used in bridges to workspace (#6705) by @Advayp in #6705
Use installation id to group connections (#6703) by @Advayp in #6703
Add Guardrails for AWS bedrock models (#6692) by @Jeadie in #6692
Update bedrock keys for CI. (#6693) by @Jeadie in #6693
Update acknowledgements (#6690) by @app/github-actions in #6690
ROADMAP updates Aug 1, 2025 (#6667) by @lukekim in #6667
Add retry logic for OpenAI embeddings creation (#6656) by @sgrebnov in #6656
Make models E2E chat test more robust (#6657) by @sgrebnov in #6657
Update Search GH Workflow to use Test Operator (#6650) by @sgrebnov in #6650
Score and P95 latency calculation for MTEB Quora-based vector search tests in Test Operator (#6640) by @sgrebnov in #6640
Fix multiple query error being classified as an internal error (#6635) by @Advayp in #6635
Add Support for S3 Table Buckets (#6573) by @krinart in #6573
set MISTRALRS_METAL_PRECOMPILE=0 for metal (#6652) by @kczimm in #6652
Vector search to push down udtf limit argument into logical sort plan (#6636) by @mach-kernel in #6636
docs: Update qa_analytics.csv (#6643) by @peasee in #6643
Update SECURITY.md (#6642) by @Jeadie in #6642
docs: Update qa_analytics.csv (#6641) by @peasee in #6641
Separate token usage (#6619) by @Advayp in #6619
Fix typo in release notes (#6634) by @Advayp in #6634
Add environment variable for org token (#6633) by @Advayp in #6633
CDC: Compute embeddings on ingest (#6612) by @mach-kernel in #6612
Add view name to view creation errors (#6611) by @lukekim in #6611
Add core logic for running MTEB Quora-based vector search tests in Test Operator (#6607) by @sgrebnov in #6607
Revert "Update generate-openapi.yml (#6584)" (#6620) by @Jeadie in #6620
Non-accelerated views should report as ready only after all dependent datasets are ready (#6617) by @sgrebnov in #6617

Spice v0.15-alpha (July 1, 2024)

July 1, 2024 · 5 min read

Luke Kim

Founder and CEO of Spice AI

The v0.15-alpha release introduces support for streaming database changes with Change Data Capture (CDC) into accelerated tables via a new Debezium connector, configurable retry logic for data refresh, and a new C# SDK for building with Spice in .NET.

Highlights in v0.15-alpha

Debezium data connector with Change Data Capture (CDC): Sync accelerated datasets with Debezium data sources over Kafka in real-time.
Data Refresh Retries: By default, accelerated datasets attempt to retry data refreshes on transient errors. This behavior can be configured using refresh_retry_enabled and refresh_retry_max_attempts.
C# Client SDK: A new C# Client SDK has been released for developing applications in .NET.

Debezium data connector with Change Data Capture (CDC)

Integrating Debezium CDC is straightforward. Get started with the Debezium CDC Sample, read more about CDC in Spice, and read the Debezium data connector documentation.

Example Spicepod using Debezium CDC:

datasets:
  - from: debezium:cdc.public.customer_addresses
    name: customer_addresses_cdc
    params:
      debezium_transport: kafka
      debezium_message_format: json
      kafka_bootstrap_servers: localhost:19092
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: changes
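
To make refresh_mode: changes more concrete, the following is a minimal Python sketch of how a stream of Debezium-style change events can be folded into a local copy of a table. It is an illustration only, not Spice's implementation. The op, before, and after fields follow the Debezium event envelope; the in-memory dict, the apply_change_event helper, and the id key column are assumptions made for this example.

```python
def apply_change_event(table, event, key="id"):
    """Apply one Debezium-style change event to an in-memory table.

    `table` maps primary-key values to row dicts. `event` carries an `op`
    code plus `before`/`after` row images, as in a Debezium envelope.
    The `key` column name is a hypothetical choice for this sketch.
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        # Create, snapshot read, and update all carry the new row in `after`.
        row = event["after"]
        table[row[key]] = row
    elif op == "d":
        # Delete carries the removed row in `before`.
        table.pop(event["before"][key], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")
    return table


# Hypothetical events for the customer_addresses table.
customer_addresses = {}
events = [
    {"op": "c", "before": None, "after": {"id": 1, "city": "Seattle"}},
    {"op": "u", "before": {"id": 1, "city": "Seattle"}, "after": {"id": 1, "city": "Portland"}},
    {"op": "c", "before": None, "after": {"id": 2, "city": "Austin"}},
    {"op": "d", "before": {"id": 2, "city": "Austin"}, "after": None},
]
for event in events:
    apply_change_event(customer_addresses, event)

print(customer_addresses)  # {1: {'id': 1, 'city': 'Portland'}}
```

Because each event is applied incrementally, the local copy stays in sync without re-reading the full source table.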

Data Refresh Retries

Example Spicepod configuration limiting refresh retries to a maximum of 10 attempts:

datasets:
  - from: eth.blocks
    name: blocks
    acceleration:
      refresh_retry_enabled: true
      refresh_retry_max_attempts: 10
      refresh_check_interval: 30s
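
The effect of these two settings can be pictured with a short Python sketch of a bounded retry loop. This is a generic illustration under stated assumptions, not the Spice runtime's code: TransientError and refresh_with_retries are hypothetical names, max_attempts is treated here as the total number of attempts, and the wait between attempts that a real implementation would normally add is omitted to keep the example short.

```python
class TransientError(Exception):
    """Hypothetical marker for errors worth retrying, such as a dropped connection."""


def refresh_with_retries(refresh, retry_enabled=True, max_attempts=10):
    """Call refresh() until it succeeds or the retry budget is used up.

    Returns the number of attempts that were made. The parameters mirror
    refresh_retry_enabled and refresh_retry_max_attempts for illustration.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            refresh()
            return attempts
        except TransientError:
            # Give up when retries are disabled or the budget is exhausted.
            if not retry_enabled or attempts >= max_attempts:
                raise


# A refresh that fails twice with a transient error, then succeeds.
state = {"calls": 0}

def flaky_refresh():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TransientError("temporary failure")

print(refresh_with_retries(flaky_refresh, max_attempts=10))  # 3
```

Errors that are not transient propagate immediately in this sketch, which matches the intent of retrying only failures that are likely to clear on their own.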

Breaking Changes

None.

New Contributors

@rupurt made their first contribution in https://github.com/spiceai/spiceai/pull/1791

Contributors

What's Changed

Dependencies

No major dependency updates.

Commits

Update version to 0.15.0-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1784
Update helm for v0.14.1-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1786
Run PR checks on PRs merging into feature-- branches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1788
Enable retries for accelerated table refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/1762
enable more tpch benchmark queries as a result of decimal unparsing by @y-f-u in https://github.com/spiceai/spiceai/pull/1790
add nix flake by @rupurt in https://github.com/spiceai/spiceai/pull/1791
Support local and HF embedding models by @Jeadie in https://github.com/spiceai/spiceai/pull/1789
fix(bin/spice): Implement custom Unmarshaller for DatasetOrReference by @peasee in https://github.com/spiceai/spiceai/pull/1787
For windows, move symlink -> symlink_file. by @Jeadie in https://github.com/spiceai/spiceai/pull/1793
docs: Add PULL_REQUEST_TEMPLATE.md by @peasee in https://github.com/spiceai/spiceai/pull/1794
Fix Unsupported DataType: conversion for time predicates by @sgrebnov in https://github.com/spiceai/spiceai/pull/1795
Use incremental backoff for initial dataset registration retries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1805
Basic HTTP/S connector by @Jeadie in https://github.com/spiceai/spiceai/pull/1792
Scale support for Snowflake fixed-point numbers by @sgrebnov in https://github.com/spiceai/spiceai/pull/1804
bump datafusion federation to resolve the join query failures by @y-f-u in https://github.com/spiceai/spiceai/pull/1806
fix: Stream PostgreSQL data in by @peasee in https://github.com/spiceai/spiceai/pull/1798
Remove clippy::module_name_repetitions lint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1812
Improve Snowflake fixed-point numbers casting by @sgrebnov in https://github.com/spiceai/spiceai/pull/1809
Case insensitive secret getter by @ewgenius in https://github.com/spiceai/spiceai/pull/1813
refactor: Format TOML with Taplo by @peasee in https://github.com/spiceai/spiceai/pull/1808
feat: Update PR template, add label enforcement in PR by @peasee in https://github.com/spiceai/spiceai/pull/1815
fix bug that append may miss updates when the incremental changes are not able to be contained in one record batch by @y-f-u in https://github.com/spiceai/spiceai/pull/1817
add integration test for inner join across federated table and accelerated table by @y-f-u in https://github.com/spiceai/spiceai/pull/1811
Unify spicepod.llms into spicepod.models and refactor UX of spicepod.models by @Jeadie in https://github.com/spiceai/spiceai/pull/1818
Fix issue with querying accelerated tables where the dataset name has a schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1823
Fix schema support for refresh_sql and improve e2e tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/1826
feat: Add GraphQL unnesting by @peasee in https://github.com/spiceai/spiceai/pull/1822
fix: Allow kind/optimization labels, increase Postgres test timeout by @peasee in https://github.com/spiceai/spiceai/pull/1830
Implement Real-time acceleration updates via Debezium CDC by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1832
Remove println statement from PG Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/1835
Don't try to "hot reload" Debezium accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1837
Create v1/search that performs vector search. by @Jeadie in https://github.com/spiceai/spiceai/pull/1836
Align spicepod UX of embeddings with models by @Jeadie in https://github.com/spiceai/spiceai/pull/1829
Add "cmake-build" feature to rdkafka for windows by @Jeadie in https://github.com/spiceai/spiceai/pull/1840
Add a better error message when trying to configure refresh_mode=changes on a data connector that doesn't support it. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1839

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.1-alpha...v0.15.0-alpha

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Twitter: @spice_ai
Slack: spiceai.org/slack
Telegram: Spice AI Discussion
Reddit: https://www.reddit.com/r/spiceai
Email: hey@spice.ai

What's New in v2.0.0-rc.5​

Cayenne Improvements​

Mutual TLS (mTLS)​

MongoDB Change Streams​

CDC Improvements​

PostgreSQL DML Support​

Snowflake DML Support​

Arrow Primary Key Upserts​

DuckLake Promoted to Beta​

User-Defined Functions​

Spatial SQL UDFs​

On-Demand Dataset Loading​

Unified Query Cancellation​

Dynamic HTTP Connector​

HTTP Rate-Control Persistence​

refresh_mode: snapshot​

Storage-Profile Accelerator Tuning​

Provider-Aware LLM Prompt Caching​

Responses API Improvements​

Distributed Cluster Improvements​

Caching & Search​

Security Improvements​

SQL, Query, and Developer Experience​

Connector Bug Fixes​

Dependency Updates​

Contributors​

Breaking Changes​

Cookbook Updates​

Upgrading​

What's Changed​

Changelog​

What's New in v2.0.0-rc.3​

HTTP Connector Enhancements​

Databricks and Unity Catalog Reliability Improvements​

Snowflake and ADBC Improvements​

OpenTelemetry and Observability Improvements​

Dependency and Toolchain Updates​

Other Improvements​

Contributors​

Breaking Changes​

Cookbook Updates​

Upgrading​

What's Changed​

Changelog​

What's New in v1.10.4​

Additional Improvements & Bug Fixes​

Contributors​

Breaking Changes​

Cookbook Updates​

Upgrading​

What's Changed​

Changelog​

What's New in v1.10.2​

Tiered Caching with Localpod​

Periodic Acceleration Snapshots​

DynamoDB JSON Nesting​

Kafka/Debezium Batching​

Additional Improvements & Bug Fixes​

Contributors​

Breaking Changes​

Cookbook Updates​

Upgrading​

What's Changed​

Changelog​

What's New in v1.6.1​

Contributors​

Breaking Changes​

Cookbook Updates​

Upgrading​

What's Changed​

Changelog​

What's New in v1.6.0​

DataFusion v48 Highlights​

Runtime Highlights​

Contributors​

New Contributors​

Breaking Changes​

Cookbook Updates​

Upgrading​

What's Changed​
