Jack Eadie

Token Plumber at Spice AI

Spice v2.0-rc.5 (May 27, 2026)

May 27, 2026 · 30 min read

Spice v2.0-rc.5 is now available! 🔥

v2.0.0-rc.5 is the fifth release candidate for advanced testing of v2.0, building on v2.0.0-rc.4.

This release completes the mTLS implementation across server endpoints and outbound connectors, adds MongoDB Change Streams as a new CDC source and durable offset persistence for Kafka CDC, expands DML write-back to PostgreSQL, Snowflake, and Arrow, and promotes DuckLake to Beta. It also introduces user-defined functions, on-demand dataset loading, unified query cancellation, dynamic HTTP request headers, subquery-driven request parameters, and provider-aware LLM prompt caching, along with a long list of Cayenne performance improvements.

Highlights in this release candidate include:

Spice Cayenne — CDC throughput, compaction and scan caching, synchronized partition commits, join filter propagation, parallel Vortex writes, lock-free deletion caches
Mutual TLS (mTLS) — TLS cert hot-reload, public mTLS for HTTP and Flight (channel + identity modes), mTLS client certs for FlightSQL and Spice.ai connectors
MongoDB Change Streams — native real-time CDC for MongoDB, no Debezium or Kafka required
Kafka CDC offsets — offsets persisted in sidecar tables for durable, resumable Kafka CDC
PostgreSQL DML — INSERT, UPDATE, DELETE write-back on PostgreSQL datasets
Snowflake DML — INSERT, UPDATE, DELETE write-back on Snowflake datasets
Arrow Primary Key Upserts — native upsert path using primary key matching
DuckLake promoted to Beta — with INSERT support on catalog tables
User-Defined Functions — define SQL UDFs in spicepods, plus remote UDFs over HTTP (Spice.ai Enterprise)
Spatial SQL UDFs — optional geospatial UDFs (ST_*) for geometry workloads
On-Demand Dataset Loading — datasets can be deferred and loaded on first reference
Unified Query Cancellation — Ctrl-C and HTTP request cancellation propagate across all execution paths
Dynamic HTTP Connector — pass-through request headers, subquery-driven params, and JSON schema decomposition
HTTP Rate-Control persistence — rate-limit state persisted in object storage across restarts
refresh_mode: snapshot — point-in-time snapshot acceleration with SQLite/Turso WAL flushing
Storage-profile accelerator tuning — accelerators auto-tune defaults based on local SSD, EBS-class disk, or tmpfs
Provider-Aware LLM Prompt Caching — automatic prompt caching for OpenAI-compatible providers that support it
Responses API — support across all model providers with streaming response.output_text.delta, plus Authorization: Bearer header support

What's New in v2.0.0-rc.5

Cayenne Improvements

Significant performance work across Spice Cayenne-backed catalogs and accelerators.

Ingest throughput: End-to-end improvements to CDC ingest, background compaction, and a new scan-result cache for hot reads; parallel Vortex partition writes; lock-free deletion caches with bloom-prefiltered probes; background retention with CDC pipelining; SQLite metastore pool scaled to 32 for high-concurrency mutation workloads.
Data inlining: Small writes are serialized as Arrow IPC and committed directly into the Cayenne metastore (cayenne_inlined_data), bypassing the staged Vortex write path for low-latency ingest. Inline upserts atomically rewrite existing inline rows instead of emitting side delete markers, and inline data remains query-visible via an in-memory union scan with a generation-keyed decode cache. Inline rows are checkpointed to Vortex when row, segment, or byte thresholds are reached. Defaults are refresh-mode aware: inline writes are enabled by default for high-frequency caching, changes, and fast append workloads and disabled for full, snapshot, and slower append.
Query planning: Join filter propagation across equi-join keys (gated behind runtime.params.cayenne_filter_propagation), range fallback for large join filters, hot-path clone elimination, and IN-list rewrites for large filter lists.
Correctness: Synchronized partition commits across partitions, correct NULL-sentinel handling for nullable partition expressions (e.g. bucket(N, col)), Vortex panic fix on highly compressible data, and live reads through expired protected snapshots.
Catalog and platform: Refresh-mode-aware compaction defaults, rejection of non-distributed Cayenne catalog configurations, and a vendored Vortex DataFusion integration for faster iteration on the Cayenne planner.
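
The data inlining behaviour described above can be pictured with a small Python model. This is a rough sketch, not Cayenne's implementation: the class name, the threshold values, and the dictionary standing in for Vortex files are all assumptions made for illustration.

```python
class InlineBuffer:
    """Toy model of inline writes: small rows are held inline, keyed by primary key."""

    def __init__(self, max_rows=4, max_bytes=1024):
        # Hypothetical thresholds; the release notes do not state the real values.
        self.max_rows = max_rows
        self.max_bytes = max_bytes
        self.inline = {}        # primary key -> payload bytes
        self.checkpointed = {}  # stands in for checkpointed columnar files

    def upsert(self, key, payload: bytes):
        # An upsert rewrites the existing inline row; no separate delete marker.
        self.inline[key] = payload
        if self._over_threshold():
            self.checkpoint()

    def _over_threshold(self):
        total_bytes = sum(len(p) for p in self.inline.values())
        return len(self.inline) >= self.max_rows or total_bytes >= self.max_bytes

    def checkpoint(self):
        # Move every inline row into the checkpointed store and clear the buffer.
        self.checkpointed.update(self.inline)
        self.inline.clear()

    def scan(self):
        # Reads see the union of both stores; the newer inline row wins.
        return {**self.checkpointed, **self.inline}
```

Calling `scan()` after each write shows why inline rows stay query-visible before they are checkpointed: the read path always merges both stores.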

Mutual TLS (mTLS)

Spice.ai Enterprise feature. See Enterprise Security.

Spice now supports full mutual TLS for both HTTP and Arrow Flight endpoints.

TLS cert hot-reload (#10727): The Spice runtime watches for SIGHUP and reloads TLS certificates without restarting, enabling cert rotation with zero downtime.

Public mTLS for HTTP and Flight (#10753): Two client_auth_mode values control how the server handles client certificates:

request — optional mTLS: the server requests a client cert but accepts connections without one (useful for migration windows).
required — strict mTLS: the server requires a valid client cert signed by the configured CA.

mTLS client certs for FlightSQL and Spice.ai connectors (#10764): Outbound connections from the FlightSQL and Spice.ai data connectors can now present client certificates for mutual authentication with upstream services.

Example configuration:

runtime:
  tls:
    enabled: true
    certificate_file: /etc/spice/tls/server.crt
    key_file: /etc/spice/tls/server.key
    client_auth_mode: required
    client_auth_ca_file: /etc/spice/tls/client-ca.crt
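
For readers more familiar with TLS in application code, the two client_auth_mode values have close analogues in Python's standard ssl module. The sketch below is only an analogy, not how the Spice runtime implements mTLS; the function name and the mapping table are illustrative.

```python
import ssl

# Analogy only: how the two modes map onto Python's ssl verify modes.
MODE_TO_VERIFY = {
    "request": ssl.CERT_OPTIONAL,   # ask for a client cert, accept connections without one
    "required": ssl.CERT_REQUIRED,  # reject clients that do not present a valid cert
}


def server_context(client_auth_mode: str) -> ssl.SSLContext:
    """Build a server-side TLS context for the given client auth mode."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = MODE_TO_VERIFY[client_auth_mode]
    return ctx
```

A real server would also call `ctx.load_cert_chain(...)` with the server certificate and key, and `ctx.load_verify_locations(...)` with the client CA file, mirroring certificate_file, key_file, and client_auth_ca_file in the configuration above.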

MongoDB Change Streams

MongoDB datasets configured with refresh_mode: changes now stream changes from MongoDB Change Streams into any local accelerator (#10813), providing real-time CDC without Debezium or Kafka.

Example configuration:

datasets:
  - from: mongodb:my_collection
    name: my_collection
    params:
      host: my-cluster.mongodb.net
      db: mydb
    acceleration:
      enabled: true
      engine: duckdb
      refresh_mode: changes

CDC Improvements

See Change Data Capture (CDC) for an overview of CDC in Spice.

Kafka CDC offset persistence (#10823): Kafka CDC offsets are persisted in sidecar tables for durable, resumable streams. On restart or failover, Spice resumes from the last committed offset.
Pipelined CDC ingestion (#10676): Source reads overlap with batch apply, with additional batching, envelope coalescing, and nullability propagation improvements across the apply pipeline.
Debezium schema evolution fix (#10144): Schema changes in Debezium-sourced datasets no longer break dataset initialization on reload (fixes #9782).
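
The idea behind durable, resumable offsets can be sketched with SQLite from the Python standard library. This is an illustration of the general technique (committing data and offset in one transaction), not Spice's actual sidecar schema; the table and column names are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, body TEXT)")
    # Hypothetical sidecar table: one row per (topic, partition).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cdc_offsets ("
        "topic TEXT, part INTEGER, next_offset INTEGER, PRIMARY KEY (topic, part))"
    )
    return conn


def apply_batch(conn, topic, part, records):
    """Apply (offset, id, body) records; data and offset commit atomically."""
    if not records:
        return
    with conn:  # a single transaction: either both writes land or neither does
        for _offset, row_id, body in records:
            conn.execute("INSERT OR REPLACE INTO events VALUES (?, ?)", (row_id, body))
        next_offset = records[-1][0] + 1
        conn.execute(
            "INSERT OR REPLACE INTO cdc_offsets VALUES (?, ?, ?)",
            (topic, part, next_offset),
        )


def resume_offset(conn, topic, part):
    """Return the offset to resume from, or 0 if nothing was committed yet."""
    row = conn.execute(
        "SELECT next_offset FROM cdc_offsets WHERE topic = ? AND part = ?",
        (topic, part),
    ).fetchone()
    return row[0] if row else 0
```

Because the offset is written in the same transaction as the rows it covers, a restart can never resume past data that was not applied.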

PostgreSQL DML Support

The PostgreSQL data connector now supports write-back via INSERT, UPDATE, and DELETE operations (#10446). Combined with the existing read-side federation, PostgreSQL-backed datasets can serve as full read/write tables. The PostgreSQL Catalog connector additionally exposes foreign-key metadata for NSQL and query planning (#10849).

Snowflake DML Support

The Snowflake data connector now supports write-back via INSERT, UPDATE, and DELETE operations (#10747), complementing its existing read capabilities.

Arrow Primary Key Upserts

Arrow-accelerated tables now support native upsert operations using primary key matching (#10749), providing efficient update-or-insert semantics for in-memory datasets.
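
Update-or-insert semantics keyed on a primary key can be shown in a few lines of Python. This is a conceptual sketch using plain dictionaries, not the Arrow accelerator's code path; the function name and the default key column are assumptions.

```python
def upsert(table, rows, key="id"):
    """Update rows whose primary key already exists and insert the rest.

    table: dict mapping primary key -> row dict (mutated in place).
    Returns a (inserted, updated) tuple of counts.
    """
    inserted = updated = 0
    for row in rows:
        if row[key] in table:
            updated += 1
        else:
            inserted += 1
        table[row[key]] = row
    return inserted, updated
```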

DuckLake Promoted to Beta

The DuckLake Catalog and Data Connector are promoted to Beta quality (#10743).

DuckLake catalog tables with read_write access now support INSERT operations (#10744), enabling full read/write workflows against DuckLake-backed catalogs. The DuckLake connector also gains a series of correctness fixes for downcast, module registration, schema discovery, and S3 credentials (#10650).

User-Defined Functions

Spice now supports user-defined functions (UDFs) as a first-class spicepod component (#10571), letting you define reusable SQL functions in the spicepod or invoke remote functions over HTTP. The runtime also gains table user functions with HTTP server gating (#10675).

A security fix closes a remote-UDF SSRF vector (#10757).

Spatial SQL UDFs

Spice now ships an optional set of geospatial SQL UDFs (ST_*) for geometry workloads (#10833). The functions are gated behind a build feature and can be invoked from any SQL surface.

On-Demand Dataset Loading

Datasets can now be marked for on-demand loading (#10629). Deferred datasets are registered with a declared schema at startup (#10669) and only fully resolve when first referenced, reducing startup time and memory footprint for spicepods with many seldom-used datasets.

Spicepods also gain columns[].type and columns[].nullable (#10661) with a lenient type parser for declaring schemas inline.
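
On-demand loading follows the familiar lazy-initialization pattern: the schema is known up front and the data is fetched only on first use. The Python sketch below illustrates that pattern under stated assumptions; the class, its attributes, and the loader callable are hypothetical and do not reflect the runtime's internals.

```python
class DeferredDataset:
    """A dataset registered with a declared schema and loaded on first reference."""

    def __init__(self, name, schema, loader):
        self.name = name
        self.schema = schema     # declared up front, so registration needs no load
        self._loader = loader    # hypothetical callable that fetches the rows
        self._rows = None
        self.load_count = 0

    def rows(self):
        if self._rows is None:   # the first reference triggers the load
            self._rows = self._loader()
            self.load_count += 1
        return self._rows
```

Datasets that are never queried never invoke the loader, which is where the startup time and memory savings come from.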

Unified Query Cancellation

All query execution paths — HTTP, Flight, FlightSQL, MCP, and internal — now honour a unified cancellation signal (#10390). When a client disconnects, presses Ctrl-C in the REPL, or cancels an in-flight HTTP request, the corresponding query is cancelled end-to-end, freeing resources promptly.
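
End-to-end cancellation is easiest to see in a cooperative async model. The sketch below uses Python's asyncio to show a cancellation signal reaching a running query and letting it clean up; it is an analogy for the behaviour described above, not the runtime's implementation, and all function names are illustrative.

```python
import asyncio


async def run_query(events):
    try:
        events.append("started")
        await asyncio.sleep(60)      # stands in for long-running query work
        events.append("finished")
    except asyncio.CancelledError:
        events.append("cancelled")   # release resources here
        raise                        # re-raise so callers see the cancellation


async def client_disconnects():
    events = []
    task = asyncio.create_task(run_query(events))
    await asyncio.sleep(0)           # yield once so the query starts
    task.cancel()                    # the cancellation signal
    try:
        await task
    except asyncio.CancelledError:
        pass
    return events


def demo():
    return asyncio.run(client_disconnects())
```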

Dynamic HTTP Connector

The HTTP data connector gains dynamic request headers parameterised from query predicates (#10604), subquery-driven request parameters for fan-out queries (#10636), HTTP response metadata as queryable columns via JSON schema decomposition (#10679), no-limit pagination (#10673), and shared rate-control across HTTP-based connectors using the same backend host (#10648).

HTTP Rate-Control Persistence

The HTTP rate-control state (per-endpoint throttle counters) is now persisted in object storage (#10697), ensuring rate limits survive restarts and are consistent across replicas. Rate-control metrics now use an origin label rather than the connector name for cleaner aggregation (#10689).

The metrics HTTP endpoint (/metrics) is also independently rate-limited (#10162) to prevent scraping from impacting query serving.
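
Two ideas from this section, sharing rate-control state per origin and persisting it across restarts, can be sketched in Python. This is a simplified fixed-window counter serialized to JSON, with hypothetical names throughout; it makes no claim about the algorithm or storage format Spice uses.

```python
import json
from urllib.parse import urlsplit


def origin_of(url):
    """Reduce a URL to scheme://host so connectors to the same host share state."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


class RateControl:
    def __init__(self, limit_per_window, state=None):
        self.limit = limit_per_window    # hypothetical fixed-window limit
        self.counts = dict(state or {})  # origin -> requests used in this window

    def try_acquire(self, url):
        origin = origin_of(url)
        used = self.counts.get(origin, 0)
        if used >= self.limit:
            return False
        self.counts[origin] = used + 1
        return True

    def dump(self):
        # Serialized state could be written to object storage between restarts.
        return json.dumps(self.counts, sort_keys=True)

    @classmethod
    def load(cls, limit_per_window, blob):
        return cls(limit_per_window, json.loads(blob))
```

Restoring from `dump()` output means a restarted process continues from the counters it had, rather than starting each origin from zero.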

refresh_mode: snapshot

Spice.ai Enterprise feature. See Acceleration Snapshots.

A new refresh_mode: snapshot provides point-in-time snapshot acceleration (#10651), with SQLite and Turso WAL flushing and a Cayenne metastore slice integration so accelerated readers see a consistent snapshot while writes continue.

Storage-Profile Accelerator Tuning

Acceleration configs gain a new storage_profile field (#10913) with values auto (default), local_ssd, ebs, and tmpfs. Under auto, the runtime detects whether the acceleration store is backed by local SSD, EBS-class network disk, or tmpfs, and applies storage-aware defaults across DuckDB, partitioned DuckDB, SQLite, Turso, and Cayenne file-mode accelerators. Explicit per-accelerator parameters always override the profile defaults.

Provider-Aware LLM Prompt Caching

LLM calls automatically use provider-aware prompt caching (#10645) when the configured model provider supports it (e.g., Anthropic, OpenAI). System prompts and tool descriptions are marked for caching so repeated invocations within the cache window reuse the provider-side cached prefix, reducing latency and cost.

A new searchable registry mode for LLM tools (#10647) lets agents discover tools by semantic search rather than enumerating all tools in the system prompt, which scales to large tool inventories.

Responses API Improvements

The Responses API is now supported across all configured model providers (#10724). Streaming delta events via response.output_text.delta are also supported (#10828). The runtime now also accepts Authorization: Bearer headers in addition to x-api-key, bumps async-openai, and stops populating FunctionToolCall.id so OpenAI-compatible servers can assign the ID themselves (#10911).

Distributed Cluster Improvements

Spice.ai Enterprise feature. See High Availability.

Per-request executor readiness gate (#10860): /v1/ready on schedulers waits for a configurable quorum of executors before returning healthy, enabling proper rolling deployments.
Ballista S3 shuffle reads under cluster mode (#10910): The shuffle reader builds its S3 client from the executor pod's environment, matching the writer. Async queries with runtime.params.shuffle_location: s3://... now complete instead of failing with AccessDenied on shuffle fetches.
Flattened scheduler config (#10450): runtime.scheduler.partition_management.* fields are flattened directly onto runtime.scheduler and renamed under the canonical "partition assignment" terminology. See Breaking Changes.

Caching & Search

Improvements across Caching and Search:

Per-principal cache namespacing (#10702): SQL, search, and caching-accelerator caches are now namespaced per authenticated principal, so cached results never cross identity boundaries.
DuckDB HNSW vector indexes (#10695, #10674, #10668): DuckDB-accelerated views support HNSW vector indexes for vector search, vector search SQL is rewritten to activate HNSW_INDEX_SCAN, and HNSW indexes are preserved across data refresh.
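
Per-principal namespacing amounts to making the principal part of the cache key. The Python sketch below shows one way to derive such a key; the function name and hashing scheme are assumptions for illustration, not the runtime's key format.

```python
import hashlib


def cache_key(principal: str, sql: str) -> str:
    """Derive a cache key that is unique per (principal, query) pair."""
    h = hashlib.sha256()
    for part in (principal, sql):
        data = part.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()
```

Two principals issuing the same SQL get different keys, so one can never be served the other's cached result.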

Security Improvements

See Authentication and TLS for configuring Spice security.

API key timing-position leak and remote-UDF SSRF (#10757): Closed a timing-based position-disclosure leak in API key comparison and blocked SSRF via remote UDF endpoint parameters.
Configurable allowed_hosts for MCP (#10638): MCP servers can be restricted to an explicit allowlist of upstream hosts.
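
A timing-position leak arises when a comparison returns as soon as it finds the first mismatching byte, so response time reveals how much of a guess was correct. The usual remedy is a constant-time comparison, sketched here with Python's standard library. It illustrates the general technique only; the Spice fix itself is in #10757.

```python
import hmac


def api_key_matches(presented: str, expected: str) -> bool:
    """Compare keys without short-circuiting on the first differing byte."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```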

SQL, Query, and Developer Experience

See the SQL Reference for the full SQL surface area.

SQL REPL expanded view (#10797): Toggle \x in the REPL for a vertical key-value layout on wide result sets.
FlightSQL Substrait plan support (#10761): The Spice runtime now implements CommandStatementSubstraitPlan, enabling clients that submit plans as Substrait-encoded protobuf.
MCP auth for streamable HTTP tools (#10927): Streamable HTTP MCP tools support native authentication via mcp_auth_token and mcp_headers, both with full Spice secret expansion.
Elasticsearch FTS engine config and index lifecycle (#10672): Direct FTS engine configuration plus index lifecycle and ingestion controls for the Elasticsearch connector.
Self-hosted Spice connector (#10546): Connect Spice to another self-hosted Spice runtime as a federated source.

Connector Bug Fixes

Notable correctness fixes across the Data Connectors: DynamoDB Streams retry on transient errors (#10794) and typed-NULL handling in DML (#10511); ScyllaDB physical filter pushdown disabled to fix incorrect results (#10772); MSSQL TOP N pushdown for non-nullable sort columns (#10621); DuckLake include filter applied (#10738); DuckDB DELETE/UPDATE on full and caching refresh modes (#10632); checked arithmetic for Turso integer-millis and timestamp-to-nanosecond conversions (#10786, #10666); and Flight GetFlightInfo/DoGet schema parity (#10864). See the Changelog for the full list.

Dependency Updates

Dependency / Component	Version
DuckDB	v1.5.2
Iceberg	v0.9.1
Turso	v0.6.0
Vortex	v0.69.0

Breaking Changes

Flattened runtime.scheduler configuration (#10450): The nested runtime.scheduler.partition_management block has been flattened and renamed to use the canonical "partition assignment" terminology. Migrate as follows:

# Before
runtime:
  scheduler:
    partition_management:
      interval: 30s
      max_assignments_per_cycle: 16
      discovery_timeout: 10s

# After
runtime:
  scheduler:
    partition_assignment_interval: 30s
    max_assignments_per_interval: 16
    partition_discovery_timeout: 10s
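
The rename can also be applied programmatically to a parsed spicepod. The Python sketch below works on a plain dictionary (for example, the result of parsing the YAML) and covers only the three fields shown in the example above; the function name is hypothetical, and any other nested field raises an error rather than being guessed at.

```python
import copy

# Old nested key -> new flattened key, taken from the example above.
RENAMES = {
    "interval": "partition_assignment_interval",
    "max_assignments_per_cycle": "max_assignments_per_interval",
    "discovery_timeout": "partition_discovery_timeout",
}


def migrate_scheduler(config: dict) -> dict:
    """Return a copy of config with partition_management flattened and renamed."""
    out = copy.deepcopy(config)
    scheduler = out.get("runtime", {}).get("scheduler", {})
    nested = scheduler.pop("partition_management", None)
    if nested is None:
        return out  # nothing to migrate
    unknown = set(nested) - set(RENAMES)
    if unknown:
        # Fields not covered by the release notes are not silently dropped.
        raise ValueError(f"no known mapping for: {sorted(unknown)}")
    for old, new in RENAMES.items():
        if old in nested:
            scheduler[new] = nested[old]
    return out
```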

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.0.0-rc.5, use one of the following methods:

CLI:

spice upgrade v2.0.0-rc.5

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0-rc.5 image:

docker pull spiceai/spiceai:2.0.0-rc.5

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0-rc.5

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

Enable DML support for PostgreSQL data connector by @phillipleblanc in #10446
feat(postgres): support inline PEM sslrootcert by @claudespice in #10578
Add foreign key metadata discovery to PostgreSQL Catalog by @sgrebnov in #10849
Add Snowflake DML support by @lukekim in #10747
Add MongoDB Change Streams support by @lukekim in #10813
Add user-defined functions by @lukekim in #10571
Add table user functions and gate HTTP servers by @lukekim in #10675
feat: add on-demand dataset loading by @phillipleblanc in #10629
feat(runtime): declared-schema deferred datasets by @phillipleblanc in #10669
feat(spicepod, runtime): add columns[].type / nullable + lenient type parser by @phillipleblanc in #10661
Replace external smb crate with internal SMB 3.1.1 client by @phillipleblanc in #10516
Add unified query cancellation across all paths by @lukekim in #10390
Add dynamic HTTP request headers by @lukekim in #10604
feat(http): Support dynamic HTTP connector request params from subqueries by @lukekim in #10636
feat(http): pass through HTTP metadata columns with JSON schema decomposition by @lukekim in #10679
Add nolimit HTTP pagination max pages by @lukekim in #10673
Add shared HTTP rate control for connectors by @lukekim in #10648
Use origin label instead of name for HTTP rate control metrics by @lukekim in #10689
fix(http): reject OR across different HTTP filter columns by @lukekim in #10625
Add provider-aware LLM prompt caching by @lukekim in #10645
Add searchable registry mode for LLM tools by @lukekim in #10647
feat: refresh_mode: snapshot + SQLite/Turso WAL flush + Cayenne metastore slice by @phillipleblanc in #10651
feat: per-principal cache namespacing for SQL/search/caching-accelerator by @lukekim in #10702
Add self-hosted Spice connector support by @phillipleblanc in #10546
Add Delta Lake Azure tenant parameter by @phillipleblanc in #10671
Support OAuth2 client credentials in 'spice cloud login' by @ewgenius in #10586
Add configurable allowed_hosts for MCP by @lukekim in #10638
fix: make Helm chart probes configurable by @peasee in #10696
Strip high-cardinality datasets dim from anonymous telemetry by @lukekim in #10711
feat(elasticsearch): direct FTS engine config + index lifecycle and ingestion controls by @lukekim in #10672
Add DuckDB HNSW vector index support for accelerated views by @sgrebnov in #10695
Rewrite DuckDB vector search SQL to activate HNSW_INDEX_SCAN by @sgrebnov in #10674
Fix DuckDB HNSW vector indexes lost after data refresh by @sgrebnov in #10668
Fix DuckDB DELETE/UPDATE on full and caching refresh mode datasets by @phillipleblanc in #10632
Fix DuckLake connector: downcast, module registration, schema discovery, and S3 credentials by @sgrebnov in #10650
Fix federation pushing denied functions inside subqueries to remote engines by @phillipleblanc in #10692
fix(caching): honour refresh_on_startup: always in caching mode by @phillipleblanc in #10594
fix(iceberg): rebuild storage factory when Hadoop catalog scheme is inferred by @sgrebnov in #10601
Pipeline CDC ingestion: overlap source reads with batch apply by @lukekim in #10676
fix: add NULL check to CDC primary key extraction by @lukekim in #10684
Properly handle nullability during CDC processing by @krinart in #10803
Flatten scheduler config and rename partition management → partition assignment by @lukekim in #10450
Improve NSQL UX and harden internal LLM tools by @lukekim in #10715
Support Responses API across model providers by @lukekim in #10724
Update xAI default model and handle Grok model retirements by @Jeadie in #10723
Improve cli table layout by @krinart in #10725
TLS cert hot-reload (mTLS plan M1) by @phillipleblanc in #10727
Fix DuckLake catalog include filter being ignored by @phillipleblanc in #10738
Promote DuckLake Catalog and Data Connector to Beta quality by @sgrebnov in #10743
feat(ducklake): Support INSERT on catalog tables with read_write access by @sgrebnov in #10744
perf(cdc): coalesce envelopes and overlap commits in apply pipeline by @lukekim in #10745
feat: Allow full version tags in spicepod version by @peasee in #10748
Add Arrow primary key upserts by @lukekim in #10749
fix(snapshot): keep refresh_mode snapshot read-only by @phillipleblanc in #10752
feat(tls): public mTLS for HTTP and Flight (channel + identity modes) by @phillipleblanc in #10753
perf(cayenne): lock-free deletion caches with bloom-prefiltered probe by @lukekim in #10756
fix(security): close API key timing-position leak and remote-UDF SSRF by @lukekim in #10757
Fix 'wait_until_dependent_tables_are_ready' for catalogs by @phillipleblanc in #10758
Fixes for views and resolved tables on 'spice refresh' CLI by @phillipleblanc in #10759
Implement FlightSQL CommandStatementSubstraitPlan support by @lukekim in #10761
feat(connectors): mTLS client cert support for flightsql and spiceai connectors by @phillipleblanc in #10764
Allow arbitrary filenames when specifying spicepod path + kind validation by @krinart in #10777
fix: ignore field metadata in schema compatibility check in index_table_scan by @Jeadie in #10778
Display pushed-down limits in EXPLAIN TREE output by @lukekim in #10779
fix: enable streaming append for Kafka with Cayenne accelerator by @lukekim in #10780
fix: bound chunked-index intermediate batch size to prevent OOM by @phillipleblanc in #10783
fix: label all columns in spice cloud metrics table output by @claudespice in #10784
fix: use checked arithmetic for Turso integer-millis timestamp read path by @claudespice in #10786
fix: use checked arithmetic in timestamp-to-nanosecond conversions by @claudespice in #10666
Upgrade to DuckDB v1.5.2 by @sgrebnov in #10788
Improve CDC ingestion performance by @lukekim in #10789
Fix tool_search/tool_invoke spans by @lukekim in #10791
Add Cayenne inline mutations and benchmark coverage by @lukekim in #10792
Ensure we always resolve table names in distributed mode/metadata by @Jeadie in #10793
Remove permanent errors from DynamoDB Streams by @krinart in #10794
Add expanded view mode for wide table display in SQL REPL by @lukekim in #10797
Fix Cayenne CDC schema mismatch error by @sgrebnov in #10800
Executors should create catalog tables on join by @Jeadie in #10807
Add compressed file support for listing connectors by @lukekim in #10809
Improve Cayenne mutation, scan, and inline memtable scaling by @lukekim in #10811
Add range fallback for large join filters by @lukekim in #10816
Improve Cayenne join filter pushdown by @lukekim in #10818
Synchronize Cayenne partition commits across partitions by @phillipleblanc in #10819
fix: Deny nondistributed cayenne catalog by @peasee in #10821
Enable parallel Cayenne Vortex writes by @lukekim in #10822
Expand Arrow type handling in formatting and Elasticsearch by @lukekim in #10825
Add response.output_text.delta to responses API by @krinart in #10828
feat(cayenne): add join filter propagation and no-spill Q21 planning by @lukekim in #10840
Upgrade Turso to v0.6.0 by @sgrebnov in #10843
feat(cli): add spice feedback command to open community Slack by @lukekim in #10856
Upgrade iceberg to v0.9.1 by @sgrebnov in #10859
feat(cluster): per-request executor readiness gate on /v1/ready by @phillipleblanc in #10860
fix: Require dim-side statistics for CayennePropagateFilterAcrossEquiJoinKeys by @sgrebnov in #10863
fix: Debezium schema evolution breaks dataset init on reload by @claudespice in #10144
fix(mssql): Push topK limit to SQL Server for non-nullable sort columns by @Jeadie in #10621
fix(ScyllaDB): disable physical filter pushdown by @sgrebnov in #10772
fix: handle typed NULLs and prevent overflow in DynamoDB DML type conversions by @krinart in #10511
fix: use InsertOp::Overwrite in DynamoDB bootstrap scan_and_overwrite_accelerator by @krinart in #10639
Improve DynamoDB Bootstrap performance by @krinart in #10616
fix: preserve field and schema metadata in Vortex type transformation by @lukekim in #10628
fix: GH connector - explicitly use AWS LC RS crypto provider for jwt by @phillipleblanc in #10619
fix: add snapshot mode guards to delete_from/update and delegate DML in SwappableTableProvider by @phillipleblanc in #10685
Persist HTTP rate-control state in object storage by @lukekim in #10697
Rate limit metrics HTTP endpoint by @lukekim in #10162
feat(geo): add optional spatial SQL UDF support by @lukekim in #10833
feat(cayenne): CDC throughput, compaction, scan caching, and benchmarks by @lukekim in #10852
fix(cayenne): fix Vortex panic on highly compressible data by @sgrebnov in #10855
fix(cayenne): Read live protected snapshots after cleanup grace period by @sgrebnov in #10901
fix: Disable Cayenne HashJoin rewriter optimizer by @sgrebnov in #10882
Fix GetFlightInfo vs DoGet Flight Schema by @krinart in #10864
fix(search): preserve column casing in /v1/search primary key plumbing by @claudespice in #10909
fix(object-store): dedupe s3 url style auto-detection log by @phillipleblanc in #10898
Improve Spice CLI manifest editing and direct command modes by @lukekim in #10815
Persist Kafka CDC offsets in sidecar tables by @lukekim in #10823
feat(task-history): record Ballista stages for distributed queries by @phillipleblanc in #10831
Add '#[deny(clippy::missing_trait_methods)]' to wrapper/delegation trait impls by @Jeadie in #10795
Optimize Cayenne catalog maintenance paths by @lukekim in #10904
Centralize DuckDB settings for accelerator by @ewgenius in #10895
deps(ballista): bump to 47e2b494 to fix S3 shuffle reads under cluster mode by @phillipleblanc in #10910
Authorization header + Bump async-openai + responses_adapter fix by @krinart in #10911
Tune accelerators by storage profile by @lukekim in #10913
feat: add dataset-level on_schema_change config by @lukekim in #10908
Handle NULL sentinel for nullable partition expressions by @Jeadie in #10880
fix: Remove Cayenne Catalog from catalog registration by @peasee in #10914
Add catalog name to foreign key metadata in postgres catalog by @Jeadie in #10917
Cayenne perf: eliminate redundant clones, PK point-lookup fanout fix, IN-list rewrite + microbench coverage by @lukekim in #10916
fix(turso-shared): retry on Turso BEGIN CONCURRENT "Write-write conflict" by @lukekim in #10946
Vendor Vortex DataFusion for Cayenne by @lukekim in #10933
perf(cayenne): background retention + enable CDC pipelining for retention-configured tables by @lukekim in #10936
feat(cayenne): scale metastore pool to 32 + vs_duckdb_scaling benches (1→128 concurrency, sqlite + turso lanes) by @lukekim in #10943
feat(mcp): support auth for streamable HTTP tools by @phillipleblanc in #10927
Explicit error if v1/search requests a table without search index by @Jeadie in #10968
Fix spicepod loading failure when directory name contains dots by @sgrebnov in #10958
Extend append tests with arrow engine configurations by @sgrebnov in #10959
Remove dataset on_schema_change Policy from rc.5 release notes by @sgrebnov in #10964
Skip tpcds_q78 for Cayenne engine at SF100 by @sgrebnov in #10966
fix: Update benchmark snapshots May-20 by @app/github-actions in #10952
Fix #10951: UdtfExec invariant Vec lengths must match children count by @phillipleblanc in #10953
docs(release): update v2.0.0-rc.5 notes with latest trunk PRs by @lukekim in #10949
Remove eval related things for v2.0.0 by @Jeadie in #10945
build(deps): bump ubuntu from 24.04 to 26.04 in the docker-dependencies group by @app/dependabot in #10883
fix: Add publish = false to chbench-driver by @sgrebnov in #10939

Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0-rc.4...v2.0.0-rc.5

Spice v1.11.1 (Feb 10, 2026)

February 10, 2026 · 4 min read

Jack Eadie

Token Plumber at Spice AI

Announcing the release of Spice v1.11.1! 🛠️

v1.11.1 is a patch release improving Spice Cayenne accelerator reliability and performance, enhancing DynamoDB Streams and HTTP data connectors, and fixing issues in Federated Task History and FlightSQL.

What's New in v1.11.1

Spice Cayenne Accelerator Improvements

This release includes stability and performance fixes for the Spice Cayenne accelerator:

Row-based Deletion Logic: Refactored row-based delete operations to use per-file deletion vectors with RoaringBitmap. Deletion scans now use Vortex-native streaming with filter pushdown and project only row indices, achieving zero data I/O for delete operations.
Constraints & On Conflict: constraints and on_conflict configurations are now automatically inferred from federated table metadata, enabling datasets like DynamoDB to work without explicitly defining primary_key in the Spicepod.
Partitioned Table Deletion: Fixed an issue where DELETE operations on partitioned Cayenne tables failed.
Data Integrity: Fixed two issues with acceleration snapshot handling: protected snapshots are now included in conflict detection keyset scans (preventing duplicate key creation during append refresh), and snapshot cleanup no longer deletes protected snapshots.
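
Per-file deletion vectors record which row indices of an immutable file are deleted, so a delete never rewrites data and reads simply skip the marked rows. The Python sketch below models this with a plain set standing in for a RoaringBitmap; the class and method names are illustrative and not Cayenne's API.

```python
from collections import defaultdict


class DeletionVectors:
    """Track deleted row indices per file; the data files are never rewritten."""

    def __init__(self):
        # file id -> set of deleted row indices (a set stands in for a bitmap)
        self._deleted = defaultdict(set)

    def delete(self, file_id, row_indices):
        self._deleted[file_id].update(row_indices)

    def live_rows(self, file_id, rows):
        dead = self._deleted.get(file_id, ())
        return [row for i, row in enumerate(rows) if i not in dead]
```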

Data Connector Improvements

DynamoDB Streams: Added automatic re-bootstrapping when the stream lag exceeds DynamoDB shard retention (24h). Configurable via the new lag_exceeds_shard_retention_behavior parameter with values error (default), ready_before_load, or ready_after_load.
HTTP Connector: HTTP responses now include a response_status column (UInt16). 4xx responses (e.g., 404 Not Found) are treated as valid queryable data and cached normally. 5xx responses are retried with backoff and, if they persist, returned to the user but excluded from the cache, so transient server errors do not pollute cached results.
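
The HTTP connector's handling of 4xx and 5xx responses can be summarized as a small policy. The Python sketch below is an illustration under stated assumptions: `send` is a hypothetical callable returning a (status, body) pair, and the attempt count and backoff delays are placeholders rather than the connector's real settings.

```python
import time


def fetch(url, send, cache, max_attempts=3, base_delay=0.1):
    """Fetch url via send(url) -> (status, body), caching everything except 5xx."""
    if url in cache:
        return cache[url]
    for attempt in range(max_attempts):
        status, body = send(url)
        if status < 500:
            break                                  # 2xx-4xx: usable data, stop retrying
        if attempt + 1 < max_attempts:
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff between retries
    row = {"response_status": status, "body": body}
    if status < 500:
        cache[url] = row                           # 4xx is cached like any other result
    return row
```

A 5xx row is still returned to the caller after the last attempt, but because it is never cached, the next query tries the upstream again.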

Other Improvements

Reliability: Added retries for SnapshotManager operations and general snapshot reliability improvements.
Reliability: Fixed handling of timestamp precision mismatches in query result caching.
Reliability: Fixed a double projection issue in federated task history queries that caused "Schema error: project index out of bounds" errors in cluster mode.
Developer Experience: Added cookie middleware support to the FlightSQL data connector.

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates. The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.1 image:

docker pull spiceai/spiceai:1.11.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.1

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

Cayenne: row-based delete logic improvements by @sgrebnov in #9237
Proper support for constraints/on_conflict in Cayenne Acceleration by @krinart in #9335
Retries for SnapshotManager by @krinart in #9334
fix(cayenne): Include protected snapshots in conflict detection keyset scan by @sgrebnov in #9176
fix(cayenne): Fix data loss by preserving protected snapshots during cleanup by @sgrebnov in #9182
Simplify retention filter expressions before pushdown by @sgrebnov in #9244
Fix test_retention_complex_sql by @sgrebnov in #9270
runtime: avoid double projection in federated task history by @phillipleblanc in #9326
feat(http): Return all HTTP responses as data, skip caching 5xx by @sgrebnov in #9313
Snapshots Improvements by @krinart in #9318
fix(caching): Handle timestamp precision mismatch and add more tests by @sgrebnov in #9315
DynamoDB Streams Table Rebootstrapping by @krinart in #9305
Fix Cayenne partitioned table deletion support by @sgrebnov in #9267
FlightSQL: add cookie middleware support by @phillipleblanc in #9282
Apply SchemaCastScanExec before applying changes in process_upsert_batch by @krinart in #9297

Spice v1.10.1 (Dec 15, 2025)

December 16, 2025 · 5 min read

Jack Eadie

Token Plumber at Spice AI

Announcing the release of Spice v1.10.1! 🚀

v1.10.1 is a patch release with Cayenne accelerator improvements, including configurable compression strategies and improved partition ID handling, an isolated refresh runtime for better query API responsiveness, and security hardening. In addition, the Go SDK, gospice v8, has been released.

What's New in v1.10.1

Cayenne Accelerator Improvements

Several improvements and bug fixes for the Cayenne data accelerator:

Compression Strategies: The new cayenne_compression_strategy parameter enables choosing between zstd for compact storage or btrblocks for encoding-efficient compression.
Improved Vortex Defaults: Aligned Cayenne to Vortex footer configuration for better compatibility.
Partition ID Handling: Improved partition ID generation to avoid potential locking race conditions.

Example spicepod.yaml configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      params:
        cayenne_compression_strategy: zstd # or btrblocks (default)

For more details, refer to the Cayenne Data Accelerator Documentation.

Isolated Refresh Runtime

Refresh tasks now run on a separate Tokio runtime isolated from the main query API. This prevents long-running or resource-intensive refresh operations from impacting query latency and ensures the /health endpoint remains responsive during heavy refresh workloads.
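
The isolation idea can be illustrated outside Rust. The following is a minimal Python analogy, not Spice's implementation: Spice uses a separate Tokio runtime, whereas this sketch assumes a second asyncio event loop on its own thread. The names IsolatedRuntime, heavy_refresh, and health are hypothetical.

```python
import asyncio
import threading


class IsolatedRuntime:
    """Runs coroutines on a dedicated event loop in its own thread."""

    def __init__(self, name: str) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(
            target=self._loop.run_forever, name=name, daemon=True
        )
        self._thread.start()

    def spawn(self, coro):
        # Schedule the coroutine on the isolated loop from any thread.
        return asyncio.run_coroutine_threadsafe(coro, self._loop)

    def shutdown(self) -> None:
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def heavy_refresh(release: threading.Event) -> str:
    # Deliberately blocking call: it stalls whichever loop runs it.
    release.wait()
    return "refreshed"


async def health() -> str:
    return "ok"
```

Because heavy_refresh is spawned on the isolated loop, a health coroutine run on the main loop still completes while the refresh is blocked.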

Security Hardening

Multiple security improvements have been implemented:

Recursion Depth Limits: Added limits to DynamoDB and S3 Vectors integrations to prevent stack overflow from deeply nested structures, mitigating potential DoS attacks.
Spicepod Summary API: The GET /v1/spicepods endpoint now returns summarized information instead of full spicepod.yaml representations, preventing potential sensitive information leakage.
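
A recursion depth limit can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the runtime's code: the function name check_depth, the exception NestingTooDeep, and the default limit of 64 are all hypothetical.

```python
from typing import Any


class NestingTooDeep(ValueError):
    """Raised when a value nests deeper than the configured limit."""


def check_depth(value: Any, max_depth: int = 64) -> int:
    """Return the container nesting depth of value, or raise past max_depth.

    An explicit stack is used instead of recursion, so a hostile payload
    cannot exhaust the call stack before the limit is enforced.
    """
    deepest = 0
    stack = [(value, 1)]
    while stack:
        node, depth = stack.pop()
        if isinstance(node, dict):
            children = list(node.values())
        elif isinstance(node, list):
            children = node
        else:
            continue  # scalars do not add nesting
        if depth > max_depth:
            raise NestingTooDeep(f"nesting depth exceeds {max_depth}")
        deepest = max(deepest, depth)
        stack.extend((child, depth + 1) for child in children)
    return deepest
```

Rejecting the payload as soon as the limit is crossed keeps the cost of a malicious document bounded by the limit rather than by the document's size.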

Additional Improvements & Bug Fixes

Performance: Fixed double hashing of user-supplied cache keys, improving cache lookup efficiency.
Reliability: Fixed idle DynamoDB Stream handling for more stable CDC operations.
Reliability: Added warnings when multiple partitions are defined for the same table.
Performance: Eagerly drop cached records for results larger than max cache size.

Spice Go SDK v8

The Spice Go SDK has been upgraded to v8 with a cleaner API, parameterized queries, and health check methods: gospice v8.0.0.

Key Features:

Cleaner API: New Sql() and SqlWithParams() methods with more intuitive naming.
Parameterized Queries: Safe, SQL-injection-resistant queries with automatic Go-to-Arrow type inference.
Typed Parameters: Explicit type control with constructors like Decimal128Param, TimestampParam, and more.
Health Check Methods: New IsSpiceHealthy() and IsSpiceReady() methods for instance monitoring.
Upgraded Dependencies: Apache Arrow v18 and ADBC Go driver v1.3.0.

Example usage with a local Spice runtime:

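import (
    "context"

    "github.com/spiceai/gospice/v8"
)

// Initialize client for local runtime
spice := gospice.NewSpiceClient()
defer spice.Close()

if err := spice.Init(
    gospice.WithFlightAddress("grpc://localhost:50051"),
); err != nil {
    panic(err)
}

// Parameterized query (safe from SQL injection).
// userId and startTime are values supplied by the caller.
ctx := context.Background()
reader, err := spice.SqlWithParams(
    ctx,
    "SELECT * FROM users WHERE id = $1 AND created_at > $2",
    userId,
    startTime,
)
if err != nil {
    panic(err)
}
// Release the Arrow record reader once the results have been consumed
defer reader.Release()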

Upgrade:

go get github.com/spiceai/gospice/v8@v8.0.0

For more details, refer to the Go SDK Documentation.

Contributors

Breaking Changes

GET /v1/spicepods no longer returns the full spicepod.yaml JSON representation. A summary is returned instead. See #8404.

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 82+ recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.10.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.1 image:

docker pull spiceai/spiceai:1.10.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Return summarized spicepods from /v1/spicepods by @phillipleblanc in #8404
DynamoDB tests and fixes by @lukekim in #8491
Use an isolated Tokio runtime for refresh tasks that is separate from the main query API by @phillipleblanc in #8504
fix: Avoid double hashing cache key by @peasee in #8511
fix: Remove unused Cayenne parameters by @peasee in #8500
feat: Support vortex zstd compressor by @peasee in #8515
Fix for idle DynamoDB Stream by @krinart in #8506
fix: Improve Cayenne errors, ID selection for table/partition creation by @peasee in #8523
Update dependencies by @phillipleblanc in #8513
Upgrade to gospice v8 by @lukekim in #8524
fix: Add recursion depth limits to prevent DoS via deeply nested data (DynamoDB + S3 Vectors) by @phillipleblanc in #8544
fix: Add warning when multiple partitions are defined for the same table by @peasee in #8540
fix: Eagerly drop cached records for results larger than max by @peasee in #8516
DDB Streams Integration Test + Memory Acceleration + Improved Warning by @krinart in #8520
fix(cluster): initialize secrets before object stores in executor by @sgrebnov in #8532
Show user-friendly error on empty DDB table by @krinart in #8586
Move 'test_projection_pushdown' to runtime-datafusion by @Jeadie in #8490
Fix stats for rewritten DistributeFileScanOptimizer plans by @mach-kernel in #8581

Spice v1.8.2 (Oct 21, 2025)

October 21, 2025 · 5 min read

Jack Eadie

Token Plumber at Spice AI

Announcing the release of Spice v1.8.2! 🔍

Spice v1.8.2 is a patch release focused on reliability, validation, performance, and bug fixes, with improvements across DuckDB acceleration, S3 Vectors, document tables, and HTTP search.

What's New in v1.8.2

Support Table Relations in /v1/search HTTP Endpoint

Spice now supports table relations for the additional_columns and where parameters in the /v1/search endpoint. This enables improved search for multi-dataset use cases, where filters and columns can be used on specific datasets.

Example:

curl 'http://localhost:8090/v1/search' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' -d '{
        "text": "hello world",
        "additional_columns": ["tbl1.foo", "tbl2.bar", "baz"],
        "where": "tbl1.foo > 100000",
        "limit": 5
    }'

In this example, search results from the tbl1 dataset will include columns foo and baz, where foo > 100000. For tbl2, columns bar and baz will be returned.
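
The column resolution described above can be sketched in a few lines. This is a simplified illustration, not the runtime's parser: the helper columns_for is hypothetical, and it assumes plain table.column names without quoting or schema qualifiers.

```python
def columns_for(table: str, additional_columns: list[str]) -> list[str]:
    """Return the extra columns that apply to `table`.

    A column written as "table.column" applies only to that table; an
    unqualified column applies to every searched table.
    """
    selected = []
    for column in additional_columns:
        relation, sep, name = column.rpartition(".")
        if not sep:
            selected.append(column)  # unqualified: applies everywhere
        elif relation == table:
            selected.append(name)  # qualified for this table
    return selected
```

With the additional_columns from the request above, tbl1 resolves to foo and baz, and tbl2 resolves to bar and baz.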

DuckDB Data Accelerator Table Partitioning & Indexing

Configurable DuckDB Index Scan: DuckDB acceleration now supports configurable duckdb_index_scan_percentage and duckdb_index_scan_max_count parameters, supporting fine-tuning of index scan behavior for improved query performance.

Example:

datasets:
  - from: postgres:my_table
    name: my_table
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      params:
        # When combined, DuckDB will use an index scan when the number of qualifying rows is less than the maximum of these two thresholds
        duckdb_index_scan_percentage: '0.10' # 10% as decimal
        duckdb_index_scan_max_count: '1000'
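
As a worked example of how the two thresholds combine, the sketch below computes the effective cutoff. The function names are hypothetical and the arithmetic only mirrors the rule stated in the comment above; the actual decision is made inside DuckDB.

```python
def index_scan_threshold(total_rows: int, percentage: float, max_count: int) -> int:
    """Effective cutoff: the larger of the absolute and the relative threshold."""
    return max(max_count, int(percentage * total_rows))


def uses_index_scan(
    qualifying_rows: int,
    total_rows: int,
    percentage: float = 0.10,
    max_count: int = 1000,
) -> bool:
    """True when the number of qualifying rows falls below the cutoff."""
    return qualifying_rows < index_scan_threshold(total_rows, percentage, max_count)
```

With the values above, a table of 1,000,000 rows has a cutoff of 100,000 rows (the percentage dominates), while a table of 5,000 rows has a cutoff of 1,000 rows (the count dominates).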

Hive-Style Partitioning: In file-partitioned mode, the DuckDB data accelerator uses Hive-style partitioning for more efficient file management.
Table-Based Partitioning: Spice now supports partitioning DuckDB accelerations within a single file. This approach maintains ACID guarantees for full and append mode refreshes, while optimizing resource usage and improving query performance. Configure via the partition_mode parameter:

datasets:
  - from: file:test_data.parquet
    name: test_data
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      params:
        partition_mode: tables
      partition_by:
        - bucket(100, Field1)

S3 Vectors Reliability

Race Condition Fix: Resolved a race condition in S3 Vectors index and bucket creation. The runtime also now checks if an index or bucket exists after a ConflictException, ensuring robust error handling during index creation and improving reliability for large-scale multi-index vector search.
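
The create-then-verify pattern described here can be sketched as follows. This is an assumption-laden illustration rather than the runtime's code: client, create_index, index_exists, and ConflictError are hypothetical stand-ins, not AWS SDK names.

```python
class ConflictError(Exception):
    """Stand-in for the service's conflict response during creation."""


def ensure_index(client, name: str) -> None:
    """Create the index, tolerating a concurrent creator.

    `client` is any object exposing create_index(name) and
    index_exists(name); both methods are hypothetical.
    """
    try:
        client.create_index(name)
    except ConflictError:
        # Another writer won the race. Confirm the index really exists
        # before treating the conflict as success.
        if not client.index_exists(name):
            raise
```

Treating a conflict as success only after the existence check avoids masking a genuine failure.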

Document Table Improvements

Primary Key Update: Document tables now use the location column as the primary key, improving performance, consistency, and query reliability.

Additional Improvements & Bugfixes

Reliability: Improved error handling and resource checks for S3 Vectors and DuckDB acceleration.
Validation: Expanded validation for partitioning and index creation.
Performance: Optimized partition refresh and index scan logic.
Bugfix: Don't nullify DuckDB release callbacks for schemas.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 81 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.8.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.8.2 image:

docker pull spiceai/spiceai:1.8.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Update mongo config for benchmarks by @krinart in #7546
Configurable DuckDB duckdb_index_scan_percentage & duckdb_index_scan_max_count by @lukekim in #7551
Fix race condition in S3 Vectors index and bucket creation by @kczimm in #7577
Use 'location' as primary key for document tables by @Jeadie in #7567
Update official Docker builds to use release binaries by @phillipleblanc in #7597
Hive-style partitioning for DuckDB file mode by @kczimm in #7563
New Generate Changelog workflow by @krinart in #7562
Add support for DuckDB table-based partitioning by @sgrebnov in #7581
DuckDB table partitioning: delete partitions that no longer exist after full refresh by @sgrebnov in #7614
Rename duckdb_partition_mode to partition_mode param by @sgrebnov in #7622
Fix license issue in table-providers by @phillipleblanc in #7620
Make DuckDB table partition data write threshold configurable by @sgrebnov in #7626
fix: Don't nullify DuckDB release callbacks for schemas by @peasee in #7628
Fix integration tests by reverting the use of batch inserts w/ prepared statements by @phillipleblanc in #7630
Return TableProvider from CandidateGeneration::search by @Jeadie in #7559
Handle table relations in HTTP v1/search by @Jeadie in #7615

Spice v1.6.1 (Sep 1, 2025)

September 2, 2025 · 3 min read

Jack Eadie

Token Plumber at Spice AI

Announcing the release of Spice v1.6.1! ⚡

Spice v1.6.1 is a patch release that provides improved Kafka type inference and JSON flattening support, alongside several bug fixes.

What's New in v1.6.1

Improved Kafka Type Inference: The number of Kafka messages sampled during schema inference is now configurable. A larger sample makes the inferred schema more robust, especially when messages contain optional fields or varying structures.

Example spicepod.yml:

datasets:
  - from: kafka:orders_events
    name: orders
    params:
      schema_infer_max_records: 100 # Default 1.

For details, see the Kafka Data Connector Documentation.
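
To see why a larger sample helps, consider this minimal sketch. It is not the connector's inference logic: infer_fields is a hypothetical helper that only unions top-level JSON field names and records the Python type name of the first value seen.

```python
import json


def infer_fields(messages: list[str], max_records: int = 1) -> dict[str, str]:
    """Union the top-level fields of the first max_records JSON messages."""
    fields: dict[str, str] = {}
    for raw in messages[:max_records]:
        for name, value in json.loads(raw).items():
            # Keep the type of the first occurrence of each field.
            fields.setdefault(name, type(value).__name__)
    return fields
```

If an optional field such as a coupon code first appears in the second message, sampling a single message misses it, while sampling more messages includes it in the schema.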

Improved Kafka JSON Support: Enable nested JSON Kafka messages to be represented in flattened JSON format for the dataset schema.

Example spicepod.yml:

datasets:
  - from: kafka:orders_events
    name: orders
    params:
      flatten_json: true # default false

For example, the object:

{
  "order_id": "a1f2c3d4-1111-2222-3333-444455556666",
  "customer": {
    "id": 101,
    "name": "Alice",
    "premium": true,
    "contact": {
      "email": "alice@example.com",
      "phone": "555-1234"
    }
  },
  "discount": 5.0,
  "shipped": false
}

With flatten_json: true the result is:

+------------------------+-----------+-------------+
| column_name            | data_type | is_nullable |
+------------------------+-----------+-------------+
| order_id               | Utf8      | YES         |
| customer.id            | Int64     | YES         |
| customer.name          | Utf8      | YES         |
| customer.premium       | Boolean   | YES         |
| customer.contact.email | Utf8      | YES         |
| customer.contact.phone | Utf8      | YES         |
| discount               | Float64   | YES         |
| shipped                | Boolean   | YES         |
+------------------------+-----------+-------------+

With flatten_json: false or omitted, the result is:

+-------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
| column_name | data_type                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | is_nullable |
+-------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
| order_id    | Utf8                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | YES         |
| customer    | Struct([Field { name: "id", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "name", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "premium", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "contact", data_type: Struct([Field { name: "email", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "phone", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) | YES         |
| discount    | Float64                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | YES         |
| shipped     | Boolean                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | YES         |
+-------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+

For details, see the Kafka Data Connector Documentation.
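
The flattening shown above can be approximated with a short recursive helper. This is a sketch of the idea only, assuming dot-separated paths as in the example table; flatten is a hypothetical name and the connector's handling of arrays and type mapping is not covered.

```python
from typing import Any, Dict


def flatten(obj: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested objects into dot-separated column names."""
    flat: Dict[str, Any] = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))  # descend into nested objects
        else:
            flat[path] = value
    return flat
```

Applied to the order object above, the keys of the result match the column_name values listed for flatten_json: true.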

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes added in this release.

The Spice Cookbook includes 77 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.6.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.6.1 image:

docker pull spiceai/spiceai:1.6.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Fix metadata field issue by @Advayp in #6957
Update datafusion and datafusion-table-providers crates (#6985) by @Jeadie in #6985
Add flatten_json param support for Kafka connector (#6976) by @sgrebnov in #6976
Add schema_inference_sample_count param support for Kafka connector (#6969) by @sgrebnov in #6969
Add integration test for Kafka connector (#6965) by @sgrebnov in #6965
Skip dataset health check for IcebergTableProvider datasets by @phillipleblanc in #6995

Spice v1.5.1 (July 28, 2025)

July 29, 2025 · 5 min read

Jack Eadie

Token Plumber at Spice AI

Announcing the release of Spice v1.5.1! 🔑

Spice v1.5.1 expands the GitHub data connector to include pull-request comments, adds configurable rate limiting for AWS Bedrock embedding models, expands partition pruning with inequality operators, and adds client-supplied cache keys for granular caching control in the HTTP and Arrow Flight SQL APIs.

What's New in v1.5.1

GitHub Data Connector Pull Request Comments: Configure GitHub pulls datasets to include comments.

Example Spicepod.yaml:

datasets:
  - from: github:github.com/spiceai/spiceai/pulls
    name: spiceai.pulls
    params:
      github_include_comments: all # 'review', 'discussion', or 'none'. Defaults to 'none'.
      github_max_comments_fetched: '25' # Defaults to 100
      # ...

For details, see the GitHub Data Connector documentation.

AWS Bedrock Embedding Models Invocation Control: Improved rate limiting control for AWS Bedrock embedding models with max_concurrent_invocations configuration.

embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      max_concurrent_invocations: '41'
      # ...

For details, see the AWS Bedrock Embeddings Model Provider documentation.

Improved Query Partitioning: Expanded partition pruning support with additional inequality operators (e.g. >, >=, <, <=).

For details, see the Query Partitioning documentation.
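
Conceptually, pruning with an inequality keeps only the partitions whose value can satisfy the predicate. The sketch below assumes one comparable value per partition and a hypothetical prune helper; the runtime's pruning also has to account for partition expressions, which this does not model.

```python
import operator

# Comparison operators a pruning predicate may use.
OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
}


def prune(partition_values: list, op: str, literal) -> list:
    """Keep only the partitions whose value satisfies `value op literal`."""
    keep = OPS[op]
    return [value for value in partition_values if keep(value, literal)]
```

For partitions keyed by year 2021 through 2024, the predicate year >= 2023 leaves only the 2023 and 2024 partitions to scan.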

Client-Supplied Cache Keys: Support for a new Spice-Cache-Key header/metadata-key in the HTTP and Arrow Flight SQL query APIs for fine-grained client-side caching control.

Example HTTP API usage:

$ curl -vvS -XPOST http://localhost:8090/v1/sql \
-H"spice-cache-key: 1851400_20170216_north_america" \
-d "select * from scihub_journals_accessed
    where user_id = '1851400'
      and date_trunc('DAY', timestamp) = '2017-02-16'
      and city = 'New York';"

Example Response:

< HTTP/1.1 200 OK
< content-type: application/json
< x-cache: Hit from spiceai
< results-cache-status: HIT
< vary: Spice-Cache-Key
< vary: origin, access-control-request-method, access-control-request-headers
< content-length: 604
< date: Wed, 23 Jul 2025 20:26:12 GMT
<
[{
"timestamp": "2017-02-16 09:55:06",
"doi": "10.1155/2012/650929",
"ip_identifier": 1000856,
"user_id": 1851400,
"country": "United States",
"city": "New York",
"longitude": 40.7830603,
"latitude": -73.9712488
},
...
]

For details, see the Cache Control documentation.
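
The same request can be built from Python's standard library. This sketch assumes a local runtime at http://localhost:8090, as in the curl example; build_sql_request is a hypothetical helper name.

```python
import urllib.request


def build_sql_request(base_url: str, sql: str, cache_key: str) -> urllib.request.Request:
    """Build a POST to /v1/sql that carries a client-supplied cache key."""
    return urllib.request.Request(
        f"{base_url}/v1/sql",
        data=sql.encode("utf-8"),
        headers={"Spice-Cache-Key": cache_key},
        method="POST",
    )
```

Passing the request to urllib.request.urlopen sends it; the results-cache-status response header shown above then indicates whether the result was served from the cache.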

Contributors

New Contributors

@varunguleriaCodes made their first contribution in github.com/spiceai/spiceai/pull/6383

Breaking Changes

Cookbook Updates

No new recipes added in this release.

The Spice Cookbook includes 74 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.5.1 image:

docker pull spiceai/spiceai:1.5.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency updates.

Changelog

Fix refresh via Api when dataset is already accelerated and no refresh interval is set by @sgrebnov in #6549
Add support for custom GraphQL unnesting behavior by @Advayp in #6540
Regex Update to disallow hyphens dataset names by @varunguleriaCodes in #6383
Enforce max limit on comments fetched per PR by @Advayp in #6580
Fix accelerated refresh issue by @Advayp in #6590
Enable configurations of max invocations for Bedrock models by @Advayp in #6592
Client-supplied cache keys (Spice-Cache-Key) by @mach-kernel in #6579
Improved partition pruning by @kczimm in #6582
Fix retention filter when both retention_sql and period are set by @sgrebnov in #6595
Initial support for PR comments by @Advayp in #6569
chore: Update croner by @peasee in #6547
fix databricks streaming for Claude model by @peasee in #6601
Remove FullTextUDTFAnalyzerRule and move FTS code into search crate by @jeadie in #6596
Remove download of legacy sentence transformers config by @jeadie in #6605
re-add snapshot tests by @jeadie
Embedding column config to support client-specified vector sizes by @mach-kernel in #6610
Fix mismatch in columns for the GitHub PR table type by @Advayp in #6616
bump version to 1.5.1 by @phillipleblanc
fix issues with cherry-picking by @jeadie
Add integration tests for GitHub PRs with comments by @Advayp in #6581
Add view name to view creation errors by @lukekim in #6611
CDC: Compute embeddings on ingest by @mach-kernel in #6612

Spice v1.2.2 (May 13, 2025)

May 13, 2025 · 5 min read

Jack Eadie

Token Plumber at Spice AI

Announcing the release of Spice v1.2.2! 🌟

Spice v1.2.2 introduces support for Databricks Mosaic AI model serving and embeddings, alongside the existing Databricks catalog and dataset integrations. It adds configurable service ports in the Helm chart and resolves several bugs to improve stability and performance.

Highlights in v1.2.2

Databricks Model & Embedding Provider: Spice integrates with Databricks Model Serving for models and embeddings, enabling secure access via machine-to-machine (M2M) OAuth authentication with service principal credentials. The runtime automatically refreshes tokens using databricks_client_id and databricks_client_secret, ensuring uninterrupted operation. This feature supports Databricks-hosted large language models and embedding models.

models:
  - from: databricks:databricks-llama-4-maverick
    name: llama-4-maverick
    params:
      databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}

embeddings:
  - from: databricks:databricks-gte-large-en
    name: gte-large-en
    params:
      databricks_endpoint: dbc-42424242-4242.cloud.databricks.com
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}

For detailed setup instructions, refer to the Databricks Model Provider documentation.

Configurable Helm Chart Service Ports: The Helm chart now supports custom service ports for more flexible deployment networking. Specify non-default ports in your Helm values file.
Resolved Issues:
- MCP Nested Tool Calling: Fixed a bug preventing nested tool invocation when Spice operates as the MCP server federating to MCP clients.
- Dataset Load Concurrency: Corrected a failure to respect the dataset_load_parallelism setting during dataset loading.
- Acceleration Hot-Reload: Addressed an issue where changes to acceleration enable/disable settings were not detected during hot reload of Spicepod.yaml.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

Updated cookbooks:

Databricks Catalogs: Includes using Databricks Service Principal
Databricks: Includes using M2M auth
Python ADBC: Adds a dataset to be queried over ADBC.

The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.2.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.2.2 image:

docker pull spiceai/spiceai:1.2.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency changes.

Changelog

- Update spark-connect-rs to override user agent string by @ewgenius in https://github.com/spiceai/spice/pull/5798
- Merge pull request by @ewgenius in https://github.com/spiceai/spice/pull/5796
- Pass the default user agent string to the Databricks Spark, Delta, and Unity clients by @ewgenius in https://github.com/spiceai/spice/pull/5717
- bump to 1.2.2 by @Jeadie in https://github.com/spiceai/spice/pull/none
- Helm chart: support for service ports overrides by @sgrebnov in https://github.com/spiceai/spice/pull/5774
- Update spice cli login command with client-id and client-secret flags for Databricks by @ewgenius in https://github.com/spiceai/spice/pull/5788
- Fix bug where setting Cache-Control: no-cache doesn't compute the cache key by @phillipleblanc in https://github.com/spiceai/spice/pull/5779
- Update to datafusion-contrib/datafusion-table-providers#336 by @phillipleblanc in https://github.com/spiceai/spice/pull/5778
- Lru cache: limit single cached record size to u32::MAX (4GB) by @sgrebnov in https://github.com/spiceai/spice/pull/5772
- Fix LLMs calling nested MCP tools by @Jeadie in https://github.com/spiceai/spice/pull/5771
- MySQL: Set the character_set_results/character_set_client/character_set_connection session variables on connection setup by @Sevenannn in https://github.com/spiceai/spice/pull/5770
- Control the parallelism of acceleration refresh datasets with runtime.dataset_load_parallelism by @phillipleblanc in https://github.com/spiceai/spice/pull/5763
- Fix Iceberg predicates not matching the Arrow type of columns read from parquet files by @phillipleblanc in https://github.com/spiceai/spice/pull/5761
- fix: Use decimal_cmp for numerical BETWEEN in SQLite by @peasee in https://github.com/spiceai/spice/pull/5760
- Support product name override in databricks user agent string by @ewgenius in https://github.com/spiceai/spice/pull/5749
- Databricks U2M Token Provider support by @ewgenius in https://github.com/spiceai/spice/pull/5747
- Remove HTTP auth from LLM config and simplify Databricks models logic by using static headers by @Jeadie in https://github.com/spiceai/spice/pull/5742
- clear plan cache when dataset updates by @kczimm in https://github.com/spiceai/spice/pull/5741
- Support Databricks M2M auth in LLMs + Embeddings by @Jeadie in https://github.com/spiceai/spice/pull/5720
- Retrieve Github App tokens in background; make TokenProvider not async by @Jeadie in https://github.com/spiceai/spice/pull/5718
- Make 'token_providers' crate by @Jeadie in https://github.com/spiceai/spice/pull/5716
- Databricks AI: Embedding models & LLM streaming by @Jeadie in https://github.com/spiceai/spice/pull/5715

See the full list of changes at: v1.2.1...v1.2.2

Spice v1.0.4 (Feb 17, 2025)

February 17, 2025 · 3 min read

Jack Eadie

Token Plumber at Spice AI

Announcing the release of Spice v1.0.4 🏎️

Spice v1.0.4 improves partition pruning for Delta Lake tables, significantly increasing scan efficiency and reducing overhead. xAI tool calling is more robust and the spice trace CLI command now provides expanded, detailed output for deeper analysis. Additionally, a bug has been fixed to correctly apply column name case-sensitivity in refresh SQL, indexes, and primary keys.

Highlights in v1.0.4

Improved Append-Based Refresh: When using an append-based acceleration where the time_column format differs from the physical partition, two new dataset configuration options, time_partition_column and time_partition_format, can be configured to improve partition pruning and exclude irrelevant partitions during refreshes.

For example, when the time_column format is timestamp and the data is physically partitioned by date, as below:

my_delta_table/
├── _delta_log/
├── date_col=2023-12-31/
├── date_col=2024-02-04/
├── date_col=2025-01-01/
└── date_col=2030-06-15/

Partition pruning can be optimized using the configuration:

datasets:
  - from: delta_lake://my_delta_table
    name: my_delta_table
    time_column: created_at # A fine-grained timestamp
    time_format: timestamp
    time_partition_column: date_col # Data is physically partitioned by `date_col`
    time_partition_format: date
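
To illustrate the effect, the sketch below maps a timestamp lower bound onto the date partitions listed earlier. It is a simplified model under stated assumptions, not the runtime's pruning code: prune_date_partitions is a hypothetical helper that assumes Hive-style column=value directory names with ISO dates.

```python
from datetime import date, datetime


def prune_date_partitions(
    partition_dirs: list[str], since: datetime, column: str = "date_col"
) -> list[str]:
    """Keep partitions that may hold rows at or after `since`."""
    cutoff = since.date()
    kept = []
    for directory in partition_dirs:
        name, _, value = directory.rstrip("/").partition("=")
        if name != column:
            continue  # not a partition directory, e.g. _delta_log/
        if date.fromisoformat(value) >= cutoff:
            kept.append(directory)
    return kept
```

An append refresh that only needs rows newer than 2024-02-04 13:30 can skip the 2023-12-31 partition entirely, while the 2024-02-04 partition is kept because it may still contain later rows from that day.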

Expanded spice trace output: The spice trace CLI command now includes additional details, such as task status, and optional flags --include-input and --include-output for detailed tracing.

Example spice trace output:

TREE                   STATUS DURATION   TASK
a97f52ccd7687e64       ✅       673.14ms ai_chat
  ├── 4eebde7b04321803 ✅         0.04ms tool_use::list_datasets
  └── 4c9049e1bf1c3500 ✅       671.91ms ai_completion

Example spice trace --include-input --include-output output:

TREE                   STATUS DURATION   TASK                    OUTPUT
a97f52ccd7687e64       ✅       673.14ms ai_chat                 The capital of New York is Albany.
  ├── 4eebde7b04321803 ✅         0.04ms tool_use::list_datasets []
  └── 4c9049e1bf1c3500 ✅       671.91ms ai_completion           [{"content":"The capital of New York is Albany.","refusal":null,"tool_calls":null,"role":"assistant","function_call":null,"audio":null}]

Contributors

@Jeadie
@peasee
@phillipleblanc
@Sevenannn
@sgrebnov
@lukekim

Breaking Changes

No breaking changes.

Cookbook Updates

No new recipes.

Upgrading

To upgrade to v1.0.4, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.4 image:

docker pull spiceai/spiceai:1.0.4

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency changes.

Changelog

- Do not return underlying content of chunked embedding column by default during tool_use::document_similarity by @Jeadie in https://github.com/spiceai/spiceai/pull/4802
- Fix Snowflake Case-Sensitive Identifiers support by @sgrebnov in https://github.com/spiceai/spiceai/pull/4813
- Prepare for 1.0.4 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4801
- Add support for a time_partition_column by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4784
- Prevent the automatic normalization of refresh_sql columns to lowercase by @sgrebnov in https://github.com/spiceai/spiceai/pull/4787
- Implement partition pruning for Delta Lake tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4783
- Fix constraint verification for columns with uppercase letters by @sgrebnov in https://github.com/spiceai/spiceai/pull/4785
- Add truncate command for spice trace by @peasee in https://github.com/spiceai/spiceai/pull/4771
- Implement Cache-Control: no-cache to bypass results cache by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4763
- Prompt user to download runtime when running spice sql by @Sevenannn in https://github.com/spiceai/spiceai/pull/4747
- Add vector search tracing by @peasee in https://github.com/spiceai/spiceai/pull/4757
- Update spice trace output format by @Jeadie in https://github.com/spiceai/spiceai/pull/4750
- Fix tool call arguments in Grok messages by @Jeadie in https://github.com/spiceai/spiceai/pull/4741

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v1.0.3...v1.0.4

Spice v1.0-rc.1 (Nov 27, 2024)

November 27, 2024 · 18 min read

Jack Eadie

Token Plumber at Spice AI

Announcing the release of Spice v1.0-rc.1 🚀

Spice v1.0.0-rc.1 marks the release candidate for the first major version of Spice.ai OSS. This milestone includes key Connector and Accelerator graduations and bug fixes, positioning Spice for a stable and production-ready release.

Highlights in v1.0-rc.1

API Key Authentication: Spice now supports optional authentication for API endpoints via configurable API keys, for additional security and control over runtime access.

Example Spicepod.yml configuration:

runtime:
  auth:
    api-key:
      enabled: true
      keys:
        - ${ secrets:api_key } # Load from a secret store
        - my-api-key # Or specify directly

Usage:

HTTP API: Include the API key in the X-API-Key header.
Flight SQL: Use the API key in the Authorization header as a Bearer token.
Spice CLI: Provide the --api-key flag for CLI commands.

For more details on using API Key auth, refer to the API Auth documentation.
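
As a rough sketch of the three usage modes listed above, the helpers below build the header or argument each client would send. The function names are hypothetical and nothing here contacts a running Spice instance; only the header names and the --api-key flag come from the notes above.

```python
def http_headers(api_key: str) -> dict:
    """HTTP API: the key is sent in the X-API-Key header."""
    return {"X-API-Key": api_key}


def flight_sql_headers(api_key: str) -> dict:
    """Flight SQL: the key is sent as a Bearer token in Authorization."""
    return {"Authorization": f"Bearer {api_key}"}


def cli_args(api_key: str) -> list:
    """Spice CLI: the key is passed with the --api-key flag."""
    return ["--api-key", api_key]


print(http_headers("my-api-key"))
print(flight_sql_headers("my-api-key"))
print(cli_args("my-api-key"))
```

In practice the key would be read from a secret store or environment variable rather than written inline.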

DuckDB Data Connector: Has graduated from Beta to Release Candidate.

Arrow and DuckDB Data Accelerators: Both have graduated from Beta to Release Candidates.

Debezium Kafka Integration: Spice now supports secure authentication and encryption options for Kafka connections when using Debezium for Change Data Capture (CDC). The previous limitation of PLAINTEXT protocol-only connections has been lifted. Spice now supports the following Kafka security configurations:

Security protocol: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL
SASL mechanisms: PLAIN, SCRAM-SHA-256, SCRAM-SHA-512

Example Spicepod.yml configuration:

datasets:
  - from: debezium:my_kafka_topic_with_debezium_changes
    name: my_dataset
    params:
      kafka_security_protocol: SASL_SSL
      kafka_sasl_mechanism: SCRAM-SHA-512
      kafka_sasl_username: kafka
      kafka_sasl_password: ${secrets:kafka_sasl_password}
      kafka_ssl_ca_location: ./certs/kafka_ca_cert.pem

Breaking changes

Model Parameters: The params.spice_tools parameter has been replaced by params.tools. Backward compatibility is maintained for existing configurations using params.spice_tools.

Dataset Accelerator State: The ready_state parameter has been moved to the dataset level.

Ready Handler Response: The response body of the /v1/ready handler has been changed from Ready (uppercase) to ready (lowercase) for consistency and adherence to standards.

Default Kafka Security for Debezium: The default value of the kafka_security_protocol parameter for Debezium datasets has changed from PLAINTEXT to SASL_SSL, improving security by default.
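
Because the default changed, Debezium datasets that never set kafka_security_protocol will now attempt SASL_SSL. One way to find them before upgrading is sketched below. It assumes the spicepod has already been parsed into a list of dictionaries (for example with a YAML library); the function name and the sample dataset names are hypothetical.

```python
def datasets_relying_on_default_protocol(datasets):
    """Return names of Debezium datasets with no explicit kafka_security_protocol."""
    flagged = []
    for ds in datasets:
        is_debezium = str(ds.get("from", "")).startswith("debezium:")
        params = ds.get("params") or {}
        if is_debezium and "kafka_security_protocol" not in params:
            flagged.append(ds.get("name"))
    return flagged


datasets = [
    {"from": "debezium:orders_topic", "name": "orders"},  # relies on the default
    {
        "from": "debezium:users_topic",
        "name": "users",
        "params": {"kafka_security_protocol": "PLAINTEXT"},  # explicit, unaffected
    },
    {"from": "delta_lake://my_delta_table", "name": "my_delta_table"},
]
print(datasets_relying_on_default_protocol(datasets))
# ['orders']
```

Any dataset the check reports either needs SASL_SSL credentials or an explicit kafka_security_protocol: PLAINTEXT to keep its previous behavior.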

Metrics Name Updates: Adjustments have been made to specific metrics for improved observability and accuracy:

Before	v1.0-rc.1
catalogs_load_error	catalog_load_errors
catalogs_status	catalog_load_state
datasets_acceleration_append_duration_ms, datasets_acceleration_load_duration_ms	dataset_acceleration_refresh_duration_ms {mode: append/full}
datasets_acceleration_last_refresh_time	dataset_acceleration_last_refresh_time_ms
datasets_acceleration_refresh_error	dataset_acceleration_refresh_errors
datasets_count	dataset_active_count
datasets_load_error	dataset_load_errors
datasets_status	dataset_load_state
datasets_unavailable_time	dataset_unavailable_time_ms
embeddings_count	embeddings_active_count
embeddings_load_error	embeddings_load_errors
embeddings_status	embeddings_load_state
flight_do_action_duration_ms, flight_do_get_get_primary_keys_duration_ms, flight_do_get_get_catalogs_duration_ms, flight_do_get_get_schemas_duration_ms, flight_do_get_get_sql_info_duration_ms, flight_do_get_table_types_duration_ms, flight_do_get_get_tables_duration_ms, flight_do_get_prepared_statement_query_duration_ms, flight_do_get_simple_duration_ms, flight_do_get_statement_query_duration_ms, flight_do_put_duration_ms, flight_handshake_request_duration_ms, flight_list_actions_duration_ms, flight_get_flight_info_request_duration_ms	flight_request_duration_ms {method: method_name, command: command_name}
flight_do_action_requests, flight_do_exchange_data_updates_sent, flight_do_exchange_requests, flight_do_put_requests, flight_do_get_requests, flight_handshake_requests, flight_list_actions_requests, flight_list_flights_requests, flight_get_flight_info_requests, flight_get_schema_requests	flight_requests {method: method_name, command: command_name}
http_requests_duration_ms	http_request_duration_ms
models_count	model_active_count
models_load_duration_ms	model_load_duration_ms
models_load_error	model_load_errors
models_status	model_load_state
tool_count	tool_active_count
tool_load_error	tool_load_errors
tools_status	tool_load_state
query_count	query_executions
query_execution_duration	query_execution_duration_ms
results_cache_hit_count	results_cache_hits
results_cache_item_count	results_cache_items_count
results_cache_max_size	results_cache_max_size_bytes
results_cache_request_count	results_cache_requests
results_cache_size	results_cache_size_bytes
secrets_stores_load_duration_ms	secrets_store_load_duration_ms
bytes_processed	query_processed_bytes
bytes_returned	query_returned_bytes
spiced_runtime_flight_server_start	runtime_flight_server_started
spiced_runtime_http_server_start	runtime_http_server_started
views_load_error	view_load_errors
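
Dashboards and alerts that reference the old names need updating. The sketch below rewrites a query string using a handful of the one-to-one renames from the table. It is a hypothetical helper, not a Spice tool: the example query syntax is an assumption, and the merged Flight and acceleration metrics, which now use dimensions, would still need manual changes.

```python
import re

# A subset of the one-to-one renames listed in the table above.
RENAMED = {
    "datasets_count": "dataset_active_count",
    "datasets_load_error": "dataset_load_errors",
    "datasets_status": "dataset_load_state",
    "http_requests_duration_ms": "http_request_duration_ms",
    "query_count": "query_executions",
    "query_execution_duration": "query_execution_duration_ms",
    "results_cache_hit_count": "results_cache_hits",
    "bytes_processed": "query_processed_bytes",
}

# Match whole metric names only; underscores count as word characters, so
# a name such as query_execution_duration_ms is left untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in sorted(RENAMED, key=len, reverse=True)) + r")\b"
)


def migrate_query(expr: str) -> str:
    """Replace old metric names in a dashboard query with the new names."""
    return _PATTERN.sub(lambda m: RENAMED[m.group(1)], expr)


print(migrate_query("rate(query_count[5m]) / sum(datasets_count)"))
# rate(query_executions[5m]) / sum(dataset_active_count)
```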

Contributors

@phillipleblanc
@sgrebnov
@Jeadie
@Sevenannn
@peasee
@slyons
@barracudarin
@lukekim
@ewgenius

What's changed

- Update to next release version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3372
- Update Helm chart to v0.20.0-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3373
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3375
- E2E: Add a test to confirm refreshing with custom `refresh-sql` via CLI by @sgrebnov in https://github.com/spiceai/spiceai/pull/3374
- Fix regression in inferring embedding model vector size for non-default models by @Jeadie in https://github.com/spiceai/spiceai/pull/3376
- add AI quickstarts to endgame by @Jeadie in https://github.com/spiceai/spiceai/pull/3378
- Remove need for `params.model_type` for most HF LLMs by @Jeadie in https://github.com/spiceai/spiceai/pull/3342
- Replace `query_duration_seconds` and `http_requests_duration_seconds` with `milliseconds` metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/3251
- Add `Extension<Runtime>` to HTTP routes to simplify tooling in NSQL. by @Jeadie in https://github.com/spiceai/spiceai/pull/3384
- Update datafusion patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3386
- Ensure hyperparameters are obeyed in recursive chat/completion calls. by @Jeadie in https://github.com/spiceai/spiceai/pull/3395
- fix: update odbc benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/3394
- Implement traits & plumbing for pluggable HTTP Auth by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3397
- Add allow_http parameter for S3 data connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3398
- Add column field to dataset spicepod component by @Jeadie in https://github.com/spiceai/spiceai/pull/3336
- feat: add duckdb connector benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/3403
- Add integration tests for OpenAI NSQL functionality by @sgrebnov in https://github.com/spiceai/spiceai/pull/3402
- Implement optional api-key auth for the HTTP endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3405
- Add integration tests for Search API (OpenAI and HF models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3410
- HTTP APIs: list tools, call tool by @Jeadie in https://github.com/spiceai/spiceai/pull/3404
- Implement optional api-key auth for the Flight/FlightSQL endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3412
- Adding semicolons to some TPCH queries to make sure they run on the CLI by @slyons in https://github.com/spiceai/spiceai/pull/3420
- Add GrpcAuth to protect the OpenTelemetry endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3417
- Support Kafka-native authentication and TLS connections for Debezium connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3419
- Add integration tests for Embeddings API (OpenAI and HF models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3416
- Support base64 embedding format by @Jeadie in https://github.com/spiceai/spiceai/pull/3418
- Give local models some love by @Jeadie in https://github.com/spiceai/spiceai/pull/3425
- Have views update on `--pods-watcher-enabled` by @Jeadie in https://github.com/spiceai/spiceai/pull/3428
- Simplify running models integration tests locally by @sgrebnov in https://github.com/spiceai/spiceai/pull/3424
- Make Debezium connector MySQL compatible by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3432
- Store + load memory tooling, enable by @Jeadie in https://github.com/spiceai/spiceai/pull/3413
- Statically compile OpenSSL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3434
- Build macOS x64 on macos-14 (Sonoma) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3435
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3443
- Bump azure_core from 0.20.0 to 0.21.0 by @dependabot in https://github.com/spiceai/spiceai/pull/3436
- Add integration tests for chat completion API (HF and OpenAI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3433
- Run Clickbench with Spice Benchmark Binary by @Sevenannn in https://github.com/spiceai/spiceai/pull/3389
- Use `datatype_is_semantically_equal` in `verify_schema` by @Sevenannn in https://github.com/spiceai/spiceai/pull/3423
- Use spiceai-large-runners to build benchmark binary by @Sevenannn in https://github.com/spiceai/spiceai/pull/3446
- Skip reqwest_retry::middleware tracing in non verbose configuration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3445
- feat: Add invalid type action handling for DuckDB by @peasee in https://github.com/spiceai/spiceai/pull/3430
- Fix benchmark: Lock poisoning issue from INSTA by @Sevenannn in https://github.com/spiceai/spiceai/pull/3457
- docs: Release DuckDB Connector RC by @peasee in https://github.com/spiceai/spiceai/pull/3459
- DR: Code Pattern For Obtaining Milliseconds-Based Duration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3460
- Improve ClickBench setup script: avoid re-downloading test data every time by @sgrebnov in https://github.com/spiceai/spiceai/pull/3463
- Fix `TableReference` quoting for MySQL by @Jeadie in https://github.com/spiceai/spiceai/pull/3461
- Tool use and model name for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/3458
- `params.tools`, not `params.spice_tools`. Allow backwards compatibility to `params.spice_tools`. by @Jeadie in https://github.com/spiceai/spiceai/pull/3473
- fix: Support DuckDB boolean list by @peasee in https://github.com/spiceai/spiceai/pull/3474
- Upgrade to DataFusion 43 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3462
- Build explicit ODBC Docker image by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3476
- Promote Arrow acceleration to RC by @sgrebnov in https://github.com/spiceai/spiceai/pull/3478
- Update benchmark workflow to create PR for updating snapshot by @Sevenannn in https://github.com/spiceai/spiceai/pull/3479
- Update benchmark snapshots for spice.ai connector tpch by @github-actions in https://github.com/spiceai/spiceai/pull/3481
- Update setup-make action by @Sevenannn in https://github.com/spiceai/spiceai/pull/3488
- Option to return sql from `v1/nsql` by @Jeadie in https://github.com/spiceai/spiceai/pull/3487
- Adding scripts to run and monitor TPC-H/-DS queries at larger scale factors by @slyons in https://github.com/spiceai/spiceai/pull/3483
- Update Datafusion and Datafusion-Table-Providers patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3489
- docs: Update Accelerator RC to specify clickbench in all modes by @peasee in https://github.com/spiceai/spiceai/pull/3490
- Add logos and marks by @lukekim in https://github.com/spiceai/spiceai/pull/3485
- Updates to repo docs by @lukekim in https://github.com/spiceai/spiceai/pull/3486
- Change `document_similarity` to return markdown, not JSON. by @Jeadie in https://github.com/spiceai/spiceai/pull/3477
- Add support for creating embeddings for Utf8View type columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/3498
- Add vector search support for Utf8View type columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/3500
- Update `datafusion-table-providers` version by @Jeadie in https://github.com/spiceai/spiceai/pull/3503
- Update `text-embeddings-inference` and `mistral.rs` from downstream. by @Jeadie in https://github.com/spiceai/spiceai/pull/3505
- Fix snapshot update PR push in benchmark by @Sevenannn in https://github.com/spiceai/spiceai/pull/3484
- Run FederationAnalyzerRule before ResolveGroupingFunction rule by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3508
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3509
- docs: Release DuckDB accelerator RC by @peasee in https://github.com/spiceai/spiceai/pull/3512
- Upgrade datafusion-functions-json to 0.43 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3511
- Update Datafusion Table Provider patch to fix MySQL refresh append mode by @Sevenannn in https://github.com/spiceai/spiceai/pull/3514
- Handle panics in HF API calls by @Jeadie in https://github.com/spiceai/spiceai/pull/3521
- Update Runtime metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3518
- Update Flight metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3515
- Update Results Cache metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3520
- Move `ready_state` to dataset level by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3526
- Add `--force` option to `spice upgrade` to force it to upgrade to the latest released version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3527
- Refactor runtime initialization into separate modules by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3531
- Update Anonymous telemetry metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3529
- Add Metrics naming principles and guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3516
- Update Dataset Acceleration metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3528
- Improve localpod startup to register immediately after its parent is registered by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3532
- AI/LLM integration tests: make tests more robust and verify more ai_tools by @sgrebnov in https://github.com/spiceai/spiceai/pull/3513
- Update dashboards to match new metrics names by @sgrebnov in https://github.com/spiceai/spiceai/pull/3530
- Clarify source of prefixes for data component parameters. by @Jeadie in https://github.com/spiceai/spiceai/pull/3541
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3564
- Update Spice release process to support release branches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3525
- fix: Validate the endpoint for ABFS and S3 by @peasee in https://github.com/spiceai/spiceai/pull/3565
- Vector Search: Default to datasets with embeddings only when none are specified by @sgrebnov in https://github.com/spiceai/spiceai/pull/3575
- Lowercase the ready handler response by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3577
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3579
- Improve `spice search` error handling by @sgrebnov in https://github.com/spiceai/spiceai/pull/3571
- Load components in parallel, not concurrently by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3566
- fix: Make S3 auth parameter validation more robust: by @peasee in https://github.com/spiceai/spiceai/pull/3578
- fix: Infer if the specified file format is correct in object store by @peasee in https://github.com/spiceai/spiceai/pull/3580
- Add ability to configure CORS on the HTTP server by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3581
- fix: Handle invalid S3 auth and region better by @peasee in https://github.com/spiceai/spiceai/pull/3582
- allow setting of replicaCount to a falsy-value by @barracudarin in https://github.com/spiceai/spiceai/pull/3586
- `spice search` to default to only datasets with embeddings by @sgrebnov in https://github.com/spiceai/spiceai/pull/3588
- Run AI integration tests as part of CI by @sgrebnov in https://github.com/spiceai/spiceai/pull/3572
- Load datasets in parallel by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3585
- Run integration test on smaller runners by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3583
- Use folders for model component by @Jeadie in https://github.com/spiceai/spiceai/pull/3584
- Improve models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3592
- Change default task_history captured_output to `none` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3598
- Add timeout to `/v1/datasets` APIs when app is locked by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3601
- Properly drop the read lock on the runtime app in http.start by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3603
- Make integration tests more robust on fewer cores by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3604
- refactor: First pass data connector error messages update by @peasee in https://github.com/spiceai/spiceai/pull/3602
- Add log if no datasets are configured by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3605
- Upgrade to DuckDB 1.1.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3606
- Add E2E test for spice search and chat functionality (OpenAI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3599
- Use spiceai-runners for TPCH / TPCDS benchmark by @Sevenannn in https://github.com/spiceai/spiceai/pull/3507
- docs: Update error handling guide by @peasee in https://github.com/spiceai/spiceai/pull/3611
- Improve default description for sql tool by @Jeadie in https://github.com/spiceai/spiceai/pull/3612
- Update metric name from `query_invocations` to `query_executions` by @sgrebnov in https://github.com/spiceai/spiceai/pull/3613
- Don't provide runtime tools to health check. by @Jeadie in https://github.com/spiceai/spiceai/pull/3615
- Sort vector search results based on similarity score by @sgrebnov in https://github.com/spiceai/spiceai/pull/3620
- Allow overriding runtime configuration with `--set-runtime` CLI flags by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3619
- Some bugs by @Jeadie in https://github.com/spiceai/spiceai/pull/3621
- Improve S3 errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/3640
- Update Databricks, Delta Lake, DuckDB error messages by @Sevenannn in https://github.com/spiceai/spiceai/pull/3642
- docs: Add error message UX to beta connector criteria by @peasee in https://github.com/spiceai/spiceai/pull/3639
- feat: Make REPL identify it's waiting on a new line by @peasee in https://github.com/spiceai/spiceai/pull/3617
- Wrap Server-Sent-Events chat errors as OpenAI error events by @sgrebnov in https://github.com/spiceai/spiceai/pull/3641
- refactor: Update accelerated table errors, dataset health monitor errors by @peasee in https://github.com/spiceai/spiceai/pull/3614
- Extend `v1/datasets` api to indicate if dataset can be used in vector search by @sgrebnov in https://github.com/spiceai/spiceai/pull/3644
- feat: Unnest DataFusion errors by @peasee in https://github.com/spiceai/spiceai/pull/3646
- feat: Add RateLimited DataConnectorError by @peasee in https://github.com/spiceai/spiceai/pull/3648
- Setup nightly docker release workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/3649
- Make LLM integration tests more extensible. by @Jeadie in https://github.com/spiceai/spiceai/pull/3576
- feat: Update ODBC error messages by @peasee in https://github.com/spiceai/spiceai/pull/3651
- feat: Better tonic errors by @peasee in https://github.com/spiceai/spiceai/pull/3650
- Nightly release workflow fixes by @ewgenius in https://github.com/spiceai/spiceai/pull/3652
- Fix missing ARM64 image for nightly publish step by @ewgenius in https://github.com/spiceai/spiceai/pull/3653
- Use GitHub GraphQL rate limiting responses to rate limit requests by @lukekim in https://github.com/spiceai/spiceai/pull/3610
- Fix typo in nightly release publish step by @ewgenius in https://github.com/spiceai/spiceai/pull/3654
- Handle GitHub rate-limiting for the Rest API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3656
- Adding custom User-Agent parameters to chat, nsql and flightrepl by @slyons in https://github.com/spiceai/spiceai/pull/3609
- Remove "nightly-" prefix from tag by @ewgenius in https://github.com/spiceai/spiceai/pull/3671
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3670
- `spice search` to warn if dataset is not ready and won't be included in search by @sgrebnov in https://github.com/spiceai/spiceai/pull/3590
- Fix keyring secret store to try both prefixed & unprefixed secrets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3672
- Handle empty embeds by allowing for nulls by @Jeadie in https://github.com/spiceai/spiceai/pull/3600
- Improve github connector error by @Sevenannn in https://github.com/spiceai/spiceai/pull/3677
- Update FlightSQL error messages by @sgrebnov in https://github.com/spiceai/spiceai/pull/3676
- Update Datafusion Table Provider Patch to include error message improvements by @Sevenannn in https://github.com/spiceai/spiceai/pull/3678
- Integration tests for `llms` crate, with basic Anthropic test. by @Jeadie in https://github.com/spiceai/spiceai/pull/3647
- Allow E2E model tests to complete even if parallel platform tests failed by @sgrebnov in https://github.com/spiceai/spiceai/pull/3679
- Add Openai to llms testing by @Jeadie in https://github.com/spiceai/spiceai/pull/3680
- Fix .env in '.github/workflows/integration_llms.yml' by @Jeadie in https://github.com/spiceai/spiceai/pull/3686
- Improve error messages for spice ai connector, separate errors to different lines for DuckDB, Delta Lake, Databricks connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3643
- Add `microsoft/Phi-3-mini-4k-instruct` to llms crate testing, with `MODEL_SKIPLIST` & `MODEL_ALLOWLIST` by @Jeadie in https://github.com/spiceai/spiceai/pull/3690
- Add nightly label to spiced version in Cargo.toml by @ewgenius in https://github.com/spiceai/spiceai/pull/3691
- Disable HF in models integration tests (not supported) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3693
- Add log when CORS is enabled by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3695
- Fix nightly release workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/3698
- Correctly set nightly labels for both release and pre-release versions by @ewgenius in https://github.com/spiceai/spiceai/pull/3699
- Improve REPL error handling for multiline error messages by @sgrebnov in https://github.com/spiceai/spiceai/pull/3692
- Determine support_filter_pushdown based on Accelerator federated reader & ZeroResultsAction by @Sevenannn in https://github.com/spiceai/spiceai/pull/3694
- Fix rdfkafak duplicated version by @Sevenannn in https://github.com/spiceai/spiceai/pull/3707
- feat: Render multiline errors better in REPL by @peasee in https://github.com/spiceai/spiceai/pull/3701
- refactor: Update UnableToAttachDataConnector error message by @peasee in https://github.com/spiceai/spiceai/pull/3706
- refactor: Update errors for Alpha connectors by @peasee in https://github.com/spiceai/spiceai/pull/3705
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3704
- Implement a RequestContext that automatically propagates request details to metric dimensions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3709
- Fix acceleration in append mode with refresh_sql specified by @sgrebnov in https://github.com/spiceai/spiceai/pull/3697
- Bump github.com/stretchr/testify from 1.9.0 to 1.10.0 by @dependabot in https://github.com/spiceai/spiceai/pull/3655
- Tokenizer for OpenAI embedding models for accurate chunking by @Jeadie in https://github.com/spiceai/spiceai/pull/3519
- Update error message when dataset isn't configured with time_column in append refresh by @Sevenannn in https://github.com/spiceai/spiceai/pull/3703
- Add the missing winver dependency in runtime crate by @Sevenannn in https://github.com/spiceai/spiceai/pull/3711
- deps: Update table providers by @peasee in https://github.com/spiceai/spiceai/pull/3712
- Add special tokens in chunk sizer by @Jeadie in https://github.com/spiceai/spiceai/pull/3713
- Disable results cache for benchmark tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3715

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.20.0-beta...v1.0.0-rc.1

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Twitter: @spice_ai
Slack: spiceai.org/slack
Telegram: Spice AI Discussion
Reddit: https://www.reddit.com/r/spiceai
Email: hey@spice.ai

Spice v0.18.3-beta (Sep 30, 2024)

September 30, 2024 · 5 min read

Jack Eadie

Token Plumber at Spice AI

Announcing the release of Spice v0.18.3-beta 🛠️

The Spice v0.18.3-beta release includes several quality-of-life improvements including verbosity flags for spiced and the Spice CLI, vector search over larger documents with support for chunking dataset embeddings, and multiple performance enhancements. Additionally, the release includes several bug fixes, dependency updates, and optimizations, including updated table providers and significantly improved GitHub data connector performance for issues and pull requests.

Highlights in v0.18.3-beta

GitHub Query Mode: A new github_query_mode: search parameter has been added to the GitHub Data Connector. It uses the GitHub Search API to make filtered queries over issues and pull requests faster and more efficient.

Example spicepod.yml:

datasets:
  - from: github:github.com/spiceai/spiceai/issues/trunk
    name: spiceai.issues
    params:
      github_query_mode: search # Use GitHub Search API
      github_token: ${secrets:GITHUB_TOKEN}

Output Verbosity: Higher verbosity output levels can be specified through flags for both spiced and the Spice CLI.

Example command line:

spice -v
spice --very-verbose

spiced -vv
spiced --verbose

Embedding Chunking: Chunking can be enabled and configured to preprocess input data before generating dataset embeddings. This improves the relevance and precision of search over larger pieces of content.

Example spicepod.yml:

datasets:
  - name: support_tickets
    embeddings:
      - column: conversation_history
        use: openai_embeddings
        chunking:
          enabled: true
          target_chunk_size: 128
          overlap_size: 16
          trim_whitespace: true

For details, see the Search Documentation.
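
To show how target_chunk_size and overlap_size interact, here is a minimal sketch of overlapping chunking. It is not Spice's chunker: it counts whitespace-separated words as a stand-in for whatever unit the runtime measures, and the function name `chunk_words` is hypothetical.

```python
def chunk_words(text, target_chunk_size=128, overlap_size=16):
    """Split text into chunks of up to target_chunk_size words.

    Consecutive chunks share overlap_size words so context that straddles a
    boundary appears in both chunks.
    """
    if not 0 <= overlap_size < target_chunk_size:
        raise ValueError("overlap_size must be non-negative and smaller than target_chunk_size")
    words = text.split()  # splitting also discards surrounding whitespace
    step = target_chunk_size - overlap_size
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + target_chunk_size]))
        if start + target_chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, target_chunk_size=4, overlap_size=1))
# ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

Each chunk repeats the last word of the previous one, which is the role overlap_size plays in the configuration above.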

Dependencies

DataFusion Table Providers: Upgraded to rev b0af91992699ecbf5adf2036a07122578f06150e.

Contributors

@Sevenannn
@peasee
@Jeadie
@sgrebnov
@phillipleblanc
@ewgenius
@slyons

What's Changed

- Update datafusion table provider patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2817
- refactor: Set max_rows_per_batch for ODBC to 4000 by @peasee in https://github.com/spiceai/spiceai/pull/2822
- Use User message for health check by @Jeadie in https://github.com/spiceai/spiceai/pull/2823
- Upgrade Helm chart (Spice v0.18.2-beta) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2820
- Add verbosity flags for spiced, spice: `-v`, `-vv`, `--verbose`, `--very-verbose`. by @Jeadie in https://github.com/spiceai/spiceai/pull/2831
- Rename `spiceai` data connector to `spice.ai` by @sgrebnov in https://github.com/spiceai/spiceai/pull/2680
- Prepare for v0.19.0-beta release (version bump) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2821
- Bump clap from 4.5.17 to 4.5.18 (#2801) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2848
- Enable "rc" feature for serde in spicepod crate by @ewgenius in https://github.com/spiceai/spiceai/pull/2851
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2852
- chore: update table providers by @peasee in https://github.com/spiceai/spiceai/pull/2858
- fix: Use GitHub search for issues in GraphQL by @peasee in https://github.com/spiceai/spiceai/pull/2845
- fix: Use GitHub search for pull_requests by @peasee in https://github.com/spiceai/spiceai/pull/2847
- Support chunking dataset embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/2854
- refactor: Update GraphQL client to be more robust for filter push down by @peasee in https://github.com/spiceai/spiceai/pull/2864
- docs: Update accelerator beta criteria by @peasee in https://github.com/spiceai/spiceai/pull/2865
- Change `BytesProcessedRule` to be an optimizer rather than an analyzer rule by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2867
- Don't run E2E or PR tests on documentation by @Jeadie in https://github.com/spiceai/spiceai/pull/2869
- Verify benchmark query results using snapshot testing (spice.ai connector) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2866
- feat: Add GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2868
- Update quickstarts for Endgame by @Jeadie in https://github.com/spiceai/spiceai/pull/2863
- Update version to v0.18.3-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/2882
- Update DataFusion: fix coalesce, Aggregation with Window functions unparsing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2884
- Revert "Rename `spiceai` data connector to `spice.ai`" by @sgrebnov in https://github.com/spiceai/spiceai/pull/2881
- Adding integration test for DuckDB read functions by @slyons in https://github.com/spiceai/spiceai/pull/2857
- Show more informative mysql error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2883
- Fix `no process-level CryptoProvider available` when using REPL and TLS by @sgrebnov in https://github.com/spiceai/spiceai/pull/2887
- Change UX for chunking and enable overlap_size in chunking by @Jeadie in https://github.com/spiceai/spiceai/pull/2890
- Add `log/slog` to spice CLI tool by @Jeadie in https://github.com/spiceai/spiceai/pull/2859
- feat: Add GitHub GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2870
- Fix mysql invalid tablename error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2896
- fix: Remove login column rename in pulls and update Optimizer by @peasee in https://github.com/spiceai/spiceai/pull/2897
- Fix require check checking. by @Jeadie in https://github.com/spiceai/spiceai/pull/2898

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.18.2-beta...v0.18.3-beta

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Twitter: @spice_ai
Slack: spiceai.org/slack
Telegram: Spice AI Discussion
Reddit: https://www.reddit.com/r/spiceai
Email: hey@spice.ai

