Spice.ai Features
Spice provides a set of features for building data-driven applications and AI agents. This page gives an overview of each feature area.
Data Query and Federation
Query Federation connects multiple data sources (databases, data warehouses, and data lakes) through a single SQL interface. Write one query that joins data across PostgreSQL, Snowflake, S3, and other sources. Spice pushes query operations down to source databases when possible to reduce data transfer.
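As a rough illustration of the "one query across several sources" idea, the sketch below uses Python's built-in sqlite3 module with two attached in-memory databases standing in for two remote sources. This is not Spice's API: the table names, schema names, and data are hypothetical, and SQLite's ATTACH is only a stand-in for real federation.

```python
import sqlite3

# Two separate in-memory SQLite databases stand in for two remote sources.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS warehouse")

# Hypothetical tables: customers live in one source, orders in the other.
conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE warehouse.orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO main.customers VALUES (?, ?)", [(1, "acme"), (2, "globex")]
)
conn.executemany(
    "INSERT INTO warehouse.orders VALUES (?, ?)",
    [(1, 120.0), (1, 30.0), (2, 75.0)],
)

# A single SQL statement joins across both sources. The WHERE clause is the
# kind of filter a federated engine would try to evaluate at the source so
# that fewer rows have to be transferred.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM main.customers AS c
    JOIN warehouse.orders AS o ON o.customer_id = c.id
    WHERE o.amount > 50
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
```

Here rows holds one total per customer, computed from orders above the filter threshold.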
Data Acceleration and Caching
Data Acceleration materializes remote datasets locally in memory or on disk using engines like Arrow, DuckDB, SQLite, or PostgreSQL. Accelerated datasets stay current through scheduled refreshes, append mode, or Change Data Capture (CDC). Caching stores query and search results in memory with configurable TTLs and eviction policies to avoid redundant computation.
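The caching behavior described above (results kept in memory, expired by TTL, and evicted under a policy) can be sketched generically as follows. This is a minimal illustration, not Spice's implementation: the TTLCache class, the cached_query helper, and the choice of least-recently-used eviction are all assumptions made for the example.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Tiny in-memory result cache with a per-entry TTL and LRU eviction."""

    def __init__(self, max_entries=128, ttl_seconds=30.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self.clock() + self.ttl_seconds, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry


def cached_query(cache, sql, run_query):
    """Return the cached result for `sql`, running the query only on a miss."""
    result = cache.get(sql)
    if result is None:
        result = run_query(sql)
        cache.put(sql, result)
    return result
```

Keying the cache on the SQL text means a repeated query is answered from memory until its TTL elapses, which is the redundant computation the cache is meant to avoid.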
Views
Views create virtual tables from SQL queries over other datasets, similar to database views. They are useful for encapsulating query logic and, when accelerated, for materializing precomputed joins or aggregates.
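For readers less familiar with database views, the following sketch shows the general idea using Python's built-in sqlite3 module. It does not use Spice or its view definition format; the orders table and customer_totals view are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'acme', 120.0), (2, 'acme', 30.0), (3, 'globex', 75.0);

    -- The view encapsulates the aggregation so callers query it like a table.
    CREATE VIEW customer_totals AS
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer;
    """
)

totals = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
```

A plain view like this one is recomputed on every query; the accelerated case mentioned above corresponds to storing the view's result instead.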
AI and Language Models
Large Language Models provides an OpenAI-compatible API gateway for hosted models (OpenAI, Anthropic, xAI) and locally served models (Llama, Phi) with CUDA and Metal acceleration. Models can call tools to query datasets, run SQL, and retrieve schemas. Embeddings generates vector representations of text for semantic search and RAG workflows. Workers coordinate interactions between models and tools, supporting load-balancing strategies such as round-robin and fallback across multiple LLM providers.
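The round-robin and fallback strategies mentioned above can be sketched in a few lines. This is an illustrative model only, not how Spice workers are configured or implemented: the RoundRobinWithFallback class, the ProviderError exception, and the idea of a provider as a plain callable are assumptions made for the example.

```python
from itertools import count


class ProviderError(Exception):
    """Raised by a provider callable when a request fails."""


class RoundRobinWithFallback:
    """Rotate the starting provider per request; on failure, try the next one."""

    def __init__(self, providers):
        # providers: list of (name, callable) pairs; each callable takes a prompt.
        self.providers = list(providers)
        self._counter = count()

    def complete(self, prompt):
        start = next(self._counter) % len(self.providers)
        errors = []
        for offset in range(len(self.providers)):
            name, call = self.providers[(start + offset) % len(self.providers)]
            try:
                return name, call(prompt)
            except ProviderError as exc:
                errors.append(f"{name}: {exc}")  # remember why, then fall back
        raise ProviderError("all providers failed: " + "; ".join(errors))
```

Round-robin spreads load by advancing the starting provider on each call, while fallback keeps a single failing provider from failing the whole request.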
Search
Search supports three methods: vector search (semantic similarity using embeddings), full-text search (keyword matching with BM25 scoring), and hybrid search (combining both with Reciprocal Rank Fusion). All search methods are accessible through SQL UDTFs like vector_search() and text_search().
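Reciprocal Rank Fusion combines ranked lists by summing 1 / (k + rank) for each document across the lists. The sketch below shows the standard formula in isolation; it is not Spice's implementation, the constant k = 60 is the conventional default from the RRF literature rather than a documented Spice setting, and the document IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks near the top.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and a full-text search.
vector_hits = ["doc_b", "doc_a", "doc_c"]
text_hits = ["doc_a", "doc_d", "doc_b"]
fused = reciprocal_rank_fusion([vector_hits, text_hits])
```

Documents that appear in both lists (doc_a and doc_b here) rise above documents that only one method found, which is the point of hybrid search.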
Functions
Functions extend SQL with custom scalar functions declared in a Spicepod. Inline SQL bodies run in-process and can use any DataFusion built-in; remote http:// / https:// endpoints batch row inputs over JSON for delegating logic to ML models, internal services, or custom code. Every function is automatically callable from SQL and (by default) surfaced as an LLM tool.
Tool Registry
Tool Registry keeps per-turn token cost bounded as the runtime's tool catalog grows. It replaces individual tool definitions with searchable tool_search and tool_invoke meta-tools backed by a hybrid full-text, keyword, schema, and vector search. It applies uniformly to built-in tools, MCP tools, and Functions declared with as_tool: true, typically yielding a ~10× reduction in tool-definition tokens for tool-heavy Spicepods.
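The meta-tool pattern can be pictured with a small model: instead of sending every tool definition to the model on each turn, only two entry points are exposed, one to search the catalog and one to invoke a tool by name. The sketch below is a simplified assumption of how that could look. Only the names tool_search and tool_invoke come from the text above; the ToolRegistry class, the method signatures, the example tools, and the keyword-overlap scoring (standing in for the hybrid search) are hypothetical.

```python
class ToolRegistry:
    """Toy catalog exposing two meta-tools instead of every tool definition."""

    def __init__(self):
        self._tools = {}  # name -> (description, callable)

    def register(self, name, description, fn):
        self._tools[name] = (description, fn)

    def tool_search(self, query, limit=3):
        """Return the best-matching tools by simple keyword overlap."""
        terms = set(query.lower().split())
        scored = []
        for name, (description, _) in self._tools.items():
            text = name.replace("_", " ") + " " + description
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                scored.append((overlap, name, description))
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [{"name": n, "description": d} for _, n, d in scored[:limit]]

    def tool_invoke(self, name, **arguments):
        """Call a registered tool by name with keyword arguments."""
        _, fn = self._tools[name]
        return fn(**arguments)


# Hypothetical tools, for illustration only.
registry = ToolRegistry()
registry.register("list_datasets", "list available datasets", lambda: ["orders"])
registry.register(
    "run_sql", "run a sql query against datasets", lambda query: f"ran {query}"
)
registry.register("get_weather", "fetch the weather forecast", lambda city: "sunny")

hits = registry.tool_search("run sql query")
result = registry.tool_invoke(hits[0]["name"], query="SELECT 1")
```

Because the model only sees the two meta-tool definitions plus the few search hits it asks for, the per-turn prompt stays roughly constant as more tools are registered.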
Monitoring and Observability
Observability exposes Prometheus-compatible metrics, OpenTelemetry metric export, and distributed tracing with Zipkin. Integrations are available for Datadog, Grafana, and other monitoring platforms.
Query Federation
2 items
Data Acceleration
7 items
Caching
Learn how to use Spice in-memory caching.
Distributed Query
Learn how to run Spice in distributed mode for larger scale queries, including the async queries API.
Change Data Capture
4 items
Data Ingestion
Learn how to ingest data in Spice.
Large Language Models
7 items
Machine Learning Models
Spice supports loading and serving ONNX models for inference, from sources including local filesystems, Hugging Face, and the Spice.ai Cloud platform.
Embedding Datasets
Learn how to define datasets with embedding columns, or augment existing datasets with them.
Search
4 items
Functions
Define custom scalar and table SQL functions inline (SQL tier) or by calling remote HTTP services (Remote tier), automatically exposed as SQL functions and LLM tools.
Semantic Model
Attach descriptions and metadata to datasets, views, and columns in Spice so LLMs, SQL functions, and humans share the same understanding of your data.
Tool Registry
Reduce per-turn token cost and improve LLM tool selection accuracy by replacing individual tool definitions with searchable tool_search and tool_invoke meta-tools backed by hybrid full-text, keyword, schema, and vector search.
Observability
1 item
Web Search
Learn how Spice can perform web search.
Views
Documentation for defining Views in Spice
Workers
Configure workers in the Spice runtime to coordinate interactions between LLMs and tools, with load-balancing strategies such as round-robin and fallback.
